DOCUMENT RESUME 



ED 223 711                                              TM 820 858

AUTHOR        De Avila, Edward; And Others
TITLE         A Neo-Piagetian Approach to Test Bias: Final Report.
INSTITUTION   De Avila, Duncan and Associates, Inc., Larkspur, CA.
SPONS AGENCY  National Inst. of Education (ED), Washington, DC.
PUB DATE      31 Mar 82
GRANT         NIE-G-79-0155
NOTE          203p.
PUB TYPE      Reports - Research/Technical (143)
EDRS PRICE    MF01/PC09 Plus Postage.
DESCRIPTORS   Cognitive Processes; Cognitive Style; *Cognitive Tests;
              *Developmental Stages; Elementary Education; *Ethnic
              Groups; Intentional Learning; Performance Factors; *Test
              Bias; *Test Coaching; *Test Wiseness
IDENTIFIERS   *Neo Piagetian Theory; *Raven Progressive Matrices



ABSTRACT 

This project examined the hypothesis that different 
background experiences associated with cultural grouping may lead to 
differences in test-taking strategies which result in score 
differences extraneous to the abilities the test is intended to
measure. Its purposes were to confirm (or disconfirm) the cultural 
differences hypothesis and to provide a systematic basis for reducing 
this potential source of test bias and invalidity. Subjects were 810 
Anglo, Black, and Mexican-American students in grades 2, 4, and 6. 
The test used was the Raven Progressive Matrices. The relative level 
of culturally related bias was predicted for each item a priori, 
based on level of complexity. Results indicated that although group 
differences on the test are related to developmental level, they are 
also related to test-taking skills. Test-taking skills are a major 
source of variation; they are learned and can be strengthened through 
exposure to specific requirements for the test. These results 
indicate that an important source of bias is as much in the overall
testing procedure as in the test itself, and challenge the assumption
that all children approach and solve a test-taking task in the same 
way. (Author/PN) 
CHAPTER I

Introduction and Theoretical Position 
Several researchers have attempted to provide theories of psychometric
test performance which would account for group differences. Probably the
most widely known psychometrically based theory is Jensen's (1973) Two
Level Theory of Mental Abilities. The theory is described by Jensen as
distinguishing between abilities involving the capacity to receive,
register and store stimulus information for later recall (Level I) and
abilities which involve the transformation, manipulation and integration
of stimulus information prior to recall (Level II). According to Jensen,
tasks which rely primarily on Level I abilities involve rote learning,
digit span and other types of simple associative learning. Level II
abilities, on the other hand, are involved in tasks like the Raven's
Progressive Matrices, Block Design of the WISC and other standard
intelligence tests. Jensen argues that group differences (especially
Black-White) are due more to differences in Level II abilities than to
differences in background experiences. He bases this interpretation on the
observation that minorities (Black and Mexican-American) in general
perform more poorly than Anglos on Level II type tests while performing
the same or better on Level I tests (Jensen, 1977).



Questions have been raised concerning the processes involved in test
performance (Das, 1973a; Das, Kirby & Jarman, 1975; Hunt, 1974; Jensen,
1979; Rohwer, 1971; Sternberg, 1978). Hunt (1974), for example, concludes
that there are two qualitatively different ways to solve Raven's
Progressive Matrix problems (a test used by Jensen as an example of
"Level II" abilities) which involve different processes. Hunt reports that
even Spearman noted this, but felt that only one of the processes for
solution is related to his general intelligence or g factor (Spearman &
Wynn-Jones, 1951, as reported by Hunt, 1974, p. 154). Hunt further notes
that similar scores and similar patterns of correct and incorrect
responses can be attained on Set I of the Raven Progressive Matrices using
either process, and thus lead one to believe (via factor analysis) that
"the test is measuring the same g factor under either manner of solution"
(in contrast to Spearman's thinking).

As a consequence of this and other issues, other researchers have
proposed differences in information processing as an explanation for group
differences in test performance (e.g., Case, 1975; Das, 1973b; Rohwer,
1971). Das (1973b) proposed an information processing model, described
earlier by Luria (1966), based on "simultaneous" and "successive
synthesis." Das contrasts his model with Jensen's, stating that the levels
model does not take into account individual differences in the tendency to
employ different processing strategies.

Along similar lines, but involving different processes, is a model
proposed by Rohwer (1971). Rohwer suggests that group differences can best
be explained in terms of an interaction between the nature of the task
(i.e., whether it requires the recall or the application of information)
and the individual's propensity to utilize either a formal conceptual
process (involving the ability to apply a well-defined set of strategies
or rules) or an imaginative process (involving the capacity to depart from
formalized conventions in a test situation). Rohwer suggests that group
differences occur because minority individuals have not had the same
opportunity to develop elaborative and conceptual processes to the same
degree as majority individuals prior to entering school.

Both Das (1973a) and Rohwer (1971) propose models which attempt to
take into account individual differences in the propensity to employ
various processing strategies. An important distinction, then, between
these models and that of Jensen is that they admit the possibility that
performance on a task can be as much a function of the individual's own
idiosyncrasies (e.g., choice of processing strategy or cognitive style) as
it is determined by the nature of the task.

At the same time, many researchers have been skeptical of psychometric
conceptions of intelligence because of their failure to be based on any
theory of cognitive abilities (e.g., De Avila, Havassy & Pascual-Leone,
1976; De Vries, 1973; Hathaway, 1973; Hunt, 1974; Kohlberg & De Vries,
1969). Hunt (1974) states: "It is inadvisable to have a technology for
measuring individual differences which stands apart from a theory of
cognition" (p. 130).

Case (1975) suggests that SES differences are due to differences in
executive repertoire of cognitive processes, rather than information
processing capacity. Using a neo-Piagetian approach (Pascual-Leone, 1969,
1970), Case states that performance on Piagetian tasks is affected by the
subject's 1) repertoire of executive strategies, 2) cognitive style (e.g.,
Witkin's, 1950, field differentiation construct), and 3) M-space, the
amount of information that the individual can coordinate simultaneously.
Case suggests that it is with respect to differences in the executive
repertoire of strategies available, which are due to experience, that
groups differ.

Such an approach represents a more comprehensive consideration of the
issue concerning group differences, the extent of which is well expressed
in cross-cultural research (e.g., Buss, 1977; Cole, 1975; Cole & Bruner,
1971; Cole & Gay, 1972; Cole, Gay, Glick & Sharp, 1971). For example, in
reference to the relationship between "psychological environment" and
group differences, Buss explains:

     There may well be cross-culturally invariant processes (as
     identified via organismic factors) while at the same time
     there may also be cross-cultural differences in the learning
     situations (and hence in the environmental dimensions) in
     which these invariant processes are applied (Buss, 1977,
     p. 204).

Thus, at the present time it is safe to say that it is simply not
known if groups differ in the processes they use to solve a task or whether
the use of different processes still means that the same thing is being
measured. However, as noted by Jensen (1979), it does appear that a
cognitive processing approach may yield more information concerning what
is involved in test performance for individuals in general. Additionally,
it is likely that such an approach may also shed more light on the issue of
group differences in test performance. At the very least, these
possibilities need to be explored.

Information Processing Capacity and Task Complexity

The questions raised above need to be taken into account when one
considers the theoretical issues involved in test performance (e.g., see
Tuddenham, 1972) aside from those involved in explaining group differences.
If we are to understand what a test measures, then we should first know
what processes are involved in test performance. In this respect, Hunt
(1974) suggests that the style of processing one chooses may be associated
with the information demands of the task. He thus concludes that we should
"look carefully at the information processing demands of an intelligence
test before we decide what the test measures" (p. 130).

A similar conclusion can be reached regarding a comment by Jensen
(1980) in which he states that if a learning task (and presumably a test
item) "is too complex, everyone, regardless of his IQ, flounders and falls
back on simpler processes such as trial-and-error and rote associations"
(p. 28). Jensen (1979) also points out that increases in task complexity
are accompanied by increases in g loading. He states: "(I)t is the task's
complexity rather than its content per se that is most related to g" (p.
18). Finally, in the same article he makes the following important
comment:

     At present, it seems safe to say, we do not have a real
     theory of g or intelligence, although we do know a good
     deal about the kinds of tests that are the most g loaded
     and the fact that the complexity of mental operations
     called for by a test is related to g (Jensen, 1979, p. 19).

Recently, a model has been developed whereby the processing demands of
a task can be determined and compared with the processing capacity of the
individual (Pascual-Leone, 1969, 1970; see also Case, 1972, 1974).
Pascual-Leone's model includes a construct which he terms Mental
Operator (M). According to him, M represents the magnitude of the
individual's central computing space or M-space, and he defines it as the
maximum number of schemes that can be coordinated at any one time (Case,
1972). Pascual-Leone (1969) argues that M is the basic organismic variable
underlying psychometric intelligence (i.e., Spearman's g).



Several investigators have attempted to apply M directly to
psychometric measures of intelligence. For example, Bereiter and
Scardamalia (1979) compared the Figure Intersection Task (FIT;
Pascual-Leone, 1969) with the Raven Progressive Matrices in an Anglo
sample and were able to predict average test performance in both
directions. They concluded that, for the most part, the FIT and the Raven
test were measuring the same general construct.

Finally, Bachelder and Denny (1977a, 1977b) presented a theory of
intelligence based on an individual's span ability. Span ability is
described much the same as Pascual-Leone's M construct. Bachelder and
Denny are careful to note that the best measures of span ability are those
that involve more complex operations as opposed to simple rote abilities,
and which do not allow the subject time to activate a cognitive strategy.
It is interesting to note the definition of intelligence provided by
Bachelder and Denny:

     Intelligence is the total set of individual difference
     variables that interact with difficulty or complexity.
     The more complex the task the more intelligent one needs
     to be to perform the task. When the task is extremely
     simple, intelligence is not a relevant variable (1977a, p. 128).

Thus, they state that span ability (like M-space) conforms to their
definition of intelligence since it interacts with task complexity.

The idea of an individual's information processing capacity as a set
measure of intelligence is not new (Pascual-Leone, 1969). What is new is
that it is only now being integrated within the framework of psychometric
test performance. The fact that researchers are beginning to use this
measure, and that it appears regularly in regard to what the best measures
of psychometric intelligence have in common, suggests that processing
capacity may offer a more interesting and rewarding measure of
intellectual ability. In addition, information processing capacity and
task demands are more amenable to experimental manipulation and control
(e.g., see Case, 1974, 1975).

Task Complexity and Culture-Loading 

The use of complexity as the common characteristic shared by tests (in
varying degrees) offers a different interpretation of test performance than
does g or general intelligence. Most important is that complexity is not
fixed (as is assumed in classical test theory), but can vary relative to
the group or persons attempting the test or item; that is, it is group
specific. For example, the division algorithm is a highly complex task for
fourth and fifth grade students. However, for an adult who has overlearned
the algorithm and is able to process the task requirements in larger
units, it is not as complex.

Similarly, the complexity of many cognitive tasks can vary according
to the processing strategy used (e.g., Case, 1975). An example of how a
task's complexity can vary is provided in Figure 1.


Figure 1. 

"Find the one object that is like the model object in color and size." 
The task in Figure 1 is to find which of the three objects is like the
model object in terms of color and size. Adults and many children attempt
to solve the problem by using a global or simultaneous processing strategy,
i.e., to solve the problem directly by finding the object that is the
same color and size. This requires simultaneous processing of both
criteria and the distracting cue provided by shape. For most adults, the
task does not present much difficulty, although some adults will still
make mistakes. For young children, however, this strategy often makes the
task too complex, since they cannot process this much information
simultaneously.

An alternative is to employ an analytic or successive processing
strategy. In this strategy possible response items are eliminated on the
basis of whether they satisfy the first criterion (color). Those remaining
are eliminated on the basis of whether they satisfy the second criterion
(size). In this way, only one piece of information is processed in each
step. Moreover, the distracting cue provided by shape is not processed at
all. Obviously, the second strategy is the preferred one since it reduces
the task's complexity to a level that can be handled by most first grade
children.
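
The contrast between the two strategies can be made concrete with a small
sketch. The Python below is illustrative only: the Item class, the example
objects, and the use of "number of attributes attended to at one time" as
a stand-in for processing load are assumptions introduced here and are not
part of the study's materials.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Item:
    color: str
    size: str
    shape: str  # distracting cue, irrelevant to the instruction


CRITERIA = ("color", "size")  # attributes named in the instruction
DISTRACTORS = ("shape",)      # salient but irrelevant attribute


def simultaneous(model, candidates):
    """Judge each candidate on all criteria at once.

    Load is modeled (hypothetically) as every attribute attended to
    together: both criteria plus the distracting cue.
    """
    matches = [c for c in candidates
               if all(getattr(c, k) == getattr(model, k) for k in CRITERIA)]
    return matches, len(CRITERIA) + len(DISTRACTORS)


def successive(model, candidates):
    """Eliminate candidates one criterion at a time.

    Only one attribute is attended to per step, and the distracting cue
    is never examined, so the modeled load per step is 1.
    """
    remaining = list(candidates)
    for k in CRITERIA:
        remaining = [c for c in remaining if getattr(c, k) == getattr(model, k)]
    return remaining, 1


model = Item("red", "large", "circle")
candidates = [
    Item("red", "small", "circle"),   # right color, wrong size
    Item("blue", "large", "circle"),  # wrong color, right size
    Item("red", "large", "square"),   # correct despite different shape
]
sim_result, sim_load = simultaneous(model, candidates)
suc_result, suc_load = successive(model, candidates)
print(sim_result == suc_result, sim_load, suc_load)  # True 3 1
```

Both strategies reach the same answer; under these assumptions they differ
only in how much must be held in mind at any one step.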

This example illustrates that a task's complexity is not necessarily
fixed, unless one can assume that all subjects will be using the same
processing strategy. This is why measures of information processing
capacity avoid items which allow "chunking" or which can be overlearned
through prior experience (e.g., Case, 1975; Bachelder & Denny, 1977a).
Thus, in the digit-span tests, strings such as 1234 or 1980 are avoided.
Bachelder and Denny (1977a) also caution that the rapidity of presentation
of items to be recalled should be set at a speed such that no time is
allowed for activation of a processing strategy (such as rehearsal) that
would reduce the complexity of the task.
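
A minimal sketch of how digit-span items might be screened for obvious
chunking follows. The screening rules (a consecutive run, an immediately
repeated digit, a four-digit string that reads as a calendar year) and the
function names are assumptions chosen for illustration; they are not the
criteria reported by Bachelder and Denny or by Case.

```python
import random


def is_chunkable(digits: str) -> bool:
    """Return True if the string invites recoding into a single familiar unit."""
    steps = {int(b) - int(a) for a, b in zip(digits, digits[1:])}
    if steps in ({1}, {-1}):   # consecutive run such as 1234 or 4321
        return True
    if 0 in steps:             # immediately repeated digit such as 77
        return True
    if len(digits) == 4 and 1900 <= int(digits) <= 2099:  # reads as a year
        return True
    return False


def make_item(length: int, rng: random.Random) -> str:
    """Draw random digit strings until one passes the screen."""
    while True:
        item = "".join(rng.choice("0123456789") for _ in range(length))
        if not is_chunkable(item):
            return item


rng = random.Random(0)  # fixed seed so the example is repeatable
items = [make_item(n, rng) for n in range(2, 9)]
```

The two strings named in the text, 1234 and 1980, are both rejected by
this screen.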

Information Processing Capacity and Cognitive Style

Processing strategy can also be influenced by the cognitive style of
the subject (Pascual-Leone, 1969; Case & Globerson, 1974). For example,
Pascual-Leone (1969) has shown that Witkin's (1950) cognitive style
construct of Field Dependence/Independence is related to a subject's
tendency to utilize maximum information processing capacity. On tasks
which require a complex conceptual response, field-dependent subjects tend
not to perform as well as field-independent subjects. Case and Globerson
(1974) suggest that the "kind" of field-dependent subject may be an
important factor and make a distinction between the subject who is
field-dependent because s/he uses little processing capacity and the
subject who is field-dependent because s/he is overly sensitive to
misleading gestalt-like cues. The latter kind of field-dependence may be
the result of unfamiliarity with the elements of the task and hence be
attributable to differences in experience.

Group differences in cognitive style have been cited by many
researchers (Castaneda, 1976; Laosa, 1978; Laosa & De Avila, 1979; Ramirez
& Castaneda, 1974; Ramirez, Castaneda & Herold, 1974; Ramirez &
Price-Williams, 1974), and the theory has been used to explain differences
in performance on Piagetian conservation tasks (Case & Pascual-Leone,
1975), information processing capacity (Case & Globerson, 1974), and other
reasoning tasks (Ramirez & Castaneda, 1974). While others have argued that
cognitive style differences are in general related to culture (e.g.,
Ramirez & Castaneda, 1974), research concerning group differences has been
equivocal (De Avila & Duncan, 1979). Moreover, Ulibarri and Flemming
(1980) reported results which contradict the cultural difference
hypothesis and suggest that cognitive style resembles more of an
individual trait variable (as opposed to an individual difference
variable) in that it tends to be a function of natural experience and
simple familiarity with the task situation, indicating a tendency to be
overly sensitive to misleading cues rather than a failure to utilize
maximum processing capacity. In this sense and for certain tasks,
cognitive style is almost a direct measure of culture-loading.

Culture-Loading and Test Bias 

The results reported above are especially relevant to the issue of
culture-loading and test bias (see Jensen, 1980; Ulibarri, 1982). If test
bias means there is something in a test which makes it easier for one
group than for another, then it is possible that this "effect" could be
detected whenever differences in processing strategies occur such that the
amount of information that must be processed is likely to be affected. If
this is the case, it would imply that certain tests are actually more
complex for certain students.

This interpretation is consistent with the findings of greater group
differences on tests that show the highest g-loadings, i.e., are the most
complex (Jensen, 1979). If individuals or groups are using different
processes to solve a task, and if the task represents different levels of
complexity because of this, then different levels of g performance could
be affected. That is, so-called g could be a function of the test's
complexity as revealed by the group taking the test. In this way, observed
group differences on g-loaded tests may really reflect differences in task
complexity. Thus, if a test is more complex for one group (i.e., requires
more information to be processed), then one would expect 1) higher
g-loadings and 2) greater group differences. These results could be due
to differences in processing demand for different groups rather than to
differences in g-ability.

Stated simply, one interpretation of culture-loading is that a task or
test item is culture-loaded if different cognitive processes or processing
strategies are likely to be used and if it has the effect of either 1)
increasing the number of discrete pieces of information that must be
processed by the specific group for whom the test is thought to be biased
(e.g., by providing cues which either increase the raw number of pieces of
information that must be coordinated, or inhibit the formation of an
executive processing strategy), or 2) decreasing the raw number of
discrete pieces of information relative to the group the test is thought
to favor (e.g., by providing cues which either reduce the amount of
information to be processed, or activate executive processing strategies
which aid in coordination of the information to be processed). More
specifically, if greater processing demands are required in order for
minority subjects to attain the same level of performance on an item as
majority subjects, who are equal in processing capacity, then the item or
task would be said to be culture-loaded.

In the following, we will describe and discuss a study designed to
test the above loosely stated hypothesis that group differences can, to a
limited extent, be explained by differences in the way in which children
from different racial/ethnic groups approach or solve tests differing in
levels of complexity.



CHAPTER II 
Design of the Study 

Methodology 

The basic methodology for this study involves comparing the relative
contribution of four cognitive/developmental measures to performance on a
standard criterion measure of analytic intelligence, and comparing this
relative contribution between subjects who received training designed to
provide the necessary executive repertoire relevant to performance on the
criterion measure and subjects who did not receive such training.

The basic hypothesis of the study is that children differ in the
likelihood of applying the desired cognitive processing strategy, and that
when this factor is controlled (through training), performance on the
criterion measure is likely to be more similar across ethnic groups.
Additionally, it is hypothesized that such differences result in cultural
bias to the extent that different processes are being measured. Stated in
another way, culture-loading on a test is said to occur whenever a test is
measuring different aspects of performance. That is, when the assumption
that all children taking the test are applying the same processing
strategies is not met, the test is not measuring the same thing in each
group.
Subjects

The study was conducted in the Northern California Bay Area. Subjects
for the study consisted of 134 Black, 83 Hispanic and 74 Anglo children in
the fourth and fifth grades. In general, each school in the particular
district tends to be composed of one ethnic group. Thus, in order to
obtain adequate samples of all three ethnic groups, it was necessary to
involve four schools in the study. Additionally, the schools tend to be
located in different parts and hence different socio-economic segments of
the city. For example, Hispanic students tend to be concentrated in
schools in or near the flatland regions of the city, Black students tend
to reside nearer the hills, and Anglo students are concentrated more in
the eastern hills of the city. Busing provides an additional dimension to
the diversity of ethnic make-up in the schools. Nevertheless, in most
cases, students of different ethnic groups were not from the same school.
In addition, the schools and the student populations are diverse on too
many other dimensions besides ethnicity to consider the groups comparable
per se. Thus, it was not possible to match the backgrounds of the students
so that a direct between-group analysis would be interpretable.

Table 1 shows the number of subjects and average age by school, sex,
and race in training and control groups. The number of children in the
training group was determined on the basis of the number of trainers
available and other logistic constraints present in each school. Selection
of the students for the training group was based on random assignment.
TABLE 1

Average Age* and Number of Subjects by School, Sex and Race
in Training and Control Groups

CONTROL

                     Black           Hispanic            Anglo
School             Male   Female     Male   Female     Male   Female

1       Age       122.5    121.9    132.7    126.4    126.0    113.0
        SD         6.94     6.31    10.69     6.93      -0-      -0-
        N            12       22        6        9        1        1

2       Age       128.0    126.3    115.5             125.7    125.7
        SD         6.42     3.79     4.95              7.48     3.20
        N             6        4        2      -0-       14        4

3       Age       126.9    128.1    123.9    123.8    120.5
        SD         10.3      7.4     7.38    11.88     3.54
        N             8        7        9       15        2      -0-

4       Age       120.3    120.4    125.7    125.7   127.27    123.7
        SD         3.98     6.37     8.31    11.37     8.56     9.95
        N             6        8        6        3       11        7

TOTAL   Age           123.6             125.5             125.7
        SD             6.92              9.32                7.
        N                73                50                40

TRAINING

                     Black           Hispanic            Anglo
School             Male   Female     Male   Female     Male   Female

1       Age       133.0    122.1    122.0    125.3             127.0
        SD        12.16     5.63     5.66     7.78              5.66
        N             3       13        4       10      -0-        2

2       Age       129.3    125.3             131.0    120.3    128.2
        SD        11.04     8.63               -0-     9.01     8.23
        N             6        9      -0-        1        7        6

3       Age       124.4    122.7    126.2    125.7    130.3
        SD        11.72     8.60     7.48     0.58     8.50
        N             9        9       12        3        3      -0-

4       Age       120.5    122.1             122.5    126.2    124.1
        SD         7.32     6.13             12.02     8.64     9.86
        N             4        8      -0-        2        5       10

TOTAL   Age           124.2             126.2             125.1
        SD             8.73              7.46              8.99
        N                61                33                34

* Age in months.
Procedure

The procedure for conducting this study involved identifying Black,
Hispanic and Anglo students in the fourth and fifth grades who would
volunteer to participate in the study and for whom parent permission was
obtained (approximately 85 to 95%). Four schools in a Northern California
Bay Area school district were identified on the basis of ethnic
composition. Students within each school were assigned to either a
training or control group through the use of a random numbers table.
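
The within-school assignment step can be sketched as follows. This is a
hedged illustration, not the procedure actually used: a seeded
pseudo-random generator stands in for the printed random numbers table,
and the function name, student identifiers, and group size shown are
hypothetical.

```python
import random


def assign_groups(student_ids, n_training, seed=None):
    """Randomly place n_training students from one school in the training
    group; everyone else in that school becomes a control."""
    ids = list(student_ids)
    rng = random.Random(seed)  # stand-in for a random numbers table
    training = set(rng.sample(ids, n_training))
    return {sid: ("training" if sid in training else "control") for sid in ids}


# Hypothetical school with 20 consenting students and capacity to train 8.
groups = assign_groups(range(1, 21), n_training=8, seed=1980)
n_train = sum(1 for g in groups.values() if g == "training")
n_control = sum(1 for g in groups.values() if g == "control")
```

Fixing the training-group size per school mirrors the statement that the
number trained depended on trainer availability and other logistic
constraints in each school.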

In the early spring of 1980 all subjects were administered four tests:
a measure of information processing capacity, a measure of cognitive style,
a measure of sensitivity to salient but misleading stimuli, and a
neo-Piagetian measure of intellectual development. With the exception of
the measure of cognitive style, all tests were administered in a group
situation consisting of approximately 15 students per test administrator.
The testing sessions lasted approximately one hour each. The information
processing test and the measure of sensitivity to perceptual pull were
administered together. The information processing test was administered
first, followed by the measure of sensitivity to perceptual pull. The
neo-Piagetian developmental test was generally administered in a separate
group session. In some cases, however, the test was administered with the
two measures described above. The individually administered measure of
cognitive style was administered after all other tests were completed.

Make-up tests were conducted for all students absent during the
regular testing sessions. Test administrators had no knowledge of whether
a student belonged to the training or control group during testing.

Following the initial testing period, students in the training group
participated in eight one-hour training sessions conducted over a two-week
period. Again, make-up was provided so that all training group students
completed the eight sessions.

Approximately one month after the initial pre-training tests, all
subjects were tested on the Raven's Standard Progressive Matrices
according to the test publisher's recommendations. Students in the
training and control groups were tested in the same sessions.

For the Hispanic group, Spanish translations were provided for each of
the tests and during the training sessions. The following will describe
the tests and the training procedure.
Criterion Tests 


Information Processing Capacity. The Figure Intersection Test (FIT)
was used as the criterion measure of processing capacity (Bachelder &
Denny, 1977b; Pascual-Leone, 1969). In the FIT, students are provided with
training on how to take the test, that is, how to find the intersection of
various overlapping shapes. During the pre-training children are taught
first that size, orientation and juxtaposition are irrelevant factors;
shape is the only relevant dimension. Second, they are taught to put a dot
in each shape that appears on the top half of the page and then to put one
dot where the same shapes are shown overlapping on the bottom of the page.

The FIT consists of seven subscales ranging from two to eight shapes.
It has been shown that a subject's ability to find the intersection is
limited according to the number of shapes but increases linearly with age
(e.g., De Avila & Havassy, 1974; Pascual-Leone, 1969; Ulibarri, 1974).

A Guttman analysis for the groups in this study yielded a coefficient
of reproducibility greater than .90 for all groups as well as
reliabilities (alpha) from .91 to .94.
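
For readers unfamiliar with the two indices, the sketch below shows one
common way to compute them from a subjects-by-items matrix of pass/fail
scores. It is offered under stated assumptions: the report does not say
which error-counting rule was used for the Guttman analysis, so the sketch
counts deviations from the ideal scale pattern implied by each subject's
total score, and the small data matrix is invented for illustration.

```python
from statistics import pvariance


def cronbach_alpha(scores):
    """Coefficient alpha: k/(k-1) * (1 - sum of item variances / total variance)."""
    k = len(scores[0])
    item_vars = [pvariance([row[j] for row in scores]) for j in range(k)]
    total_var = pvariance([sum(row) for row in scores])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


def reproducibility(scores):
    """Guttman coefficient of reproducibility: 1 - errors / (subjects * items).

    Items are ordered from easiest to hardest. A subject with total
    score s is expected to pass exactly the s easiest items; every cell
    that departs from that ideal pattern counts as one error.
    """
    n, k = len(scores), len(scores[0])
    order = sorted(range(k), key=lambda j: -sum(row[j] for row in scores))
    errors = 0
    for row in scores:
        total = sum(row)
        ideal = [1] * total + [0] * (k - total)
        observed = [row[j] for j in order]
        errors += sum(o != i for o, i in zip(observed, ideal))
    return 1 - errors / (n * k)


# Invented data: four subjects, three items forming a perfect scale.
perfect = [[1, 1, 1], [1, 1, 0], [1, 0, 0], [0, 0, 0]]
alpha = cronbach_alpha(perfect)
rep = reproducibility(perfect)
```

A perfectly cumulative response matrix gives a reproducibility of 1.0;
values above .90 are conventionally taken to indicate a scalable set of
items.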



Cognitive Style. The Children's Embedded Figures Test (CEFT) was
adapted by Karp and Konstadt (1963, in Witkin, Oltman, Raskin & Karp, 1971)
as a measure of perceptual disembedding. The test requires students to
locate a previously seen simple standard figure within a larger complex
figure. A score is determined by the number of first correct choices made.
Higher scores represent greater field independence. The task requires the
subjects to overcome misleading cues provided by the larger, more complex
figure. The more independent a subject is from the background or field
provided by the larger figure, the more field independent the subject is
said to be (Duncan & De Avila, 1979).

Sensitivity to Misleading Cues. The Water Level Test (WLT)
(Pascual-Leone, 1969, 1970) is a neo-Piagetian measure of cognitive
development (Piaget & Inhelder, 1948). Pascual-Leone (1970) has shown the
task to be highly related to both information processing capacity and
cognitive style. The test consists of a series of illustrated bottles
against a three-dimensional rectangular background. Subjects are told to
pretend that a pictured bottle is half-full, to draw a line showing where
the water level would be and to mark an "x" where the water would be in
the bottle. The test contains three subscales consisting of
two-dimensional vertical (right-side up and upside down) bottles and
tilted bottles, and three-dimensional vertical and tilted bottles. A
subject's score is determined according to deviations from the correct
water-level line and correct placement of the location of the water in the
bottle. The test is reported in De Avila et al. (1976), Ulibarri (1974),
and Pascual-Leone (1972).



Developmental Level. The Cartoon Conservation Scales (CCS) (De Avila,
1977) is a neo-Piagetian measure of intellectual development devised from
Piagetian theory.

The CCS is made up of eight subtests consisting of 4 items each for a
total test length of 32 items. Each item of a particular subtest measures
the same concept, each in a slightly different way, by picturing different
materials. The eight subtests are listed below in order of increasing
difficulty.

   1. Conservation of Length        2. Egocentricity/Perspective
   3. Conservation of Number        4. Horizontality of Water
   5. Conservation of Substance     6. Conservation of Volume
   7. Conservation of Distance      8. Probability

The Cartoon Conservation Scales consists of a cartoon-like layout,
with the problem presented in three frames on the upper portion of the page
and three alternative answer frames located on the lower half of the page.
In the first frame of the problem set, an equality or inequality is
established. In the second frame, an identity transformation takes place,
and in the third frame, a question of equivalence or inequivalence is
posed. Three possible answers are based upon the most frequent incorrect
responses given by children of this age group. The position of the correct
response is varied across the cartoons. Also, the correct response was
varied between "yes" and "no" in order to minimize the effects of "yea
saying." Different content was used for each presentation of a concept
(e.g., in the conservation of substance, one item used clay as the
material and another used beans).

Strictly speaking, the Egocentricity/Perspective scale items are not
conservation tasks. Nevertheless, previous research has shown that
egocentricity items are excellent predictors of conservation and early
formal operations. In an egocentricity/perspective item, children are asked
to deal with the problem of shifting perspective or point-of-view as
represented in three-dimensional space.


Analytic Intelligence 

The Raven's Standard Progressive Matrices was the criterion measure of
analytic ability. The test has been used extensively in the literature and
so is only briefly described here. According to Jensen (1974a) the Raven
test is a relatively culture-free test. The test consists of 5 subscales
of 12 items each. The task is to identify the missing element out of a
possible six or eight alternatives. Each item consists of a pattern or
sequence of figures. The subject must determine (i.e., abstract) a general
rule which, when applied, will lead to selection of the correct response
from the possible alternatives. Bachelder and Denny (1979) have shown this
test to be highly correlated with the FIT (about .71).

Training Procedures

The following is a brief overview of the training procedures. Following
this, a more detailed description of each of the training exercises is
presented. The purpose of the training was to provide the children with
the required executive (i.e., cognitive) repertoire of experiences
necessary to perform on the Raven's Progressive Matrices (e.g., see
Feurenstein, 1979). This is analogous to what De Avila and Havassy (1974)
termed experimental repertoire control and what Spearman and Wynn-Jones
(1951, in Hunt, 1974) termed fundaments, that is, controlling for factors
considered relevant to taking a test. Without such control, there is a
question of whether the test is being administered fairly or, to put it
another way, whether the test is likely to measure the same thing for all
children taking the test. Generally, the requirement that all children be
engaged in the test in the same way, and that they have had equal exposure
and opportunity to learn the prerequisites for taking a test, is assumed.

Ten test administrators and eight trainers were used for the study.
Administrators and trainers received training on how to administer each of
the tests, and on how to conduct the training. The training lasted
approximately three full days and consisted of practice taking and
administering each of the tests. All test administrators and trainers were
college students and some held graduate-level degrees; all but one had prior
experience testing young children.

The training consists of 12 paper-and-pencil exercises (see Appendix
A) administered over an eight-day period. The training sessions varied in
length from one-half hour to one full hour. The size of the training groups
varied from 8 to 12 students per trainer. In each case children were
required to work on each exercise until it was completed. Only then were
they allowed to continue on to the next exercise. Some children moved faster
than others, but in no case was a child dropped from the training. Children
completing the exercises with little difficulty were simply excused while
additional help was provided to others. The exercises are summarized in
Table 2 together with the day-to-day schedule. Following Table 2 is a more
detailed description of the training exercises. The training exercises are
based primarily on the work of Feurenstein (1979) and borrow heavily from
his research and training.

TABLE 2

Summary of Training Activities and Skill or Problem Area Addressed

Day 1
  Exercise: Review of CEFT and WLT.
  Skill or Problem Area Addressed: Demonstration of errors and effect
  of misleading cues.

Days 2 & 3
  Exercise: 1&2 - Mediated Learning: Subject must find the object that is
  the same as the model object according to the criterion given (subjects
  proceed when they pass criterion test). Exercises 2 and 3 are the same
  but increase in complexity (i.e., criteria).
  Skill or Problem Area Addressed: Ability to categorize, gathering
  information from two sources, applying analytic processing strategies,
  focusing on task instructions, defining problem. Ignoring irrelevant
  but salient visual information and inhibiting impulsive behavior.

Day 4
  Exercise: 3&4 - Dots Training Sheet: Subject must connect seven dots to
  complete a square and triangle shape.
  Skill or Problem Area Addressed: Practice in visual transport, forming
  visual structures, using planned behavior, organizing information,
  gathering precise data, overcoming distracting cues, and forming wholes
  from parts.

Day 5
  Exercise: 5&6 - Figure Completion: Subject must find the part that is
  missing and complete the figure. Pattern Completion: Subject must
  complete a pattern to look identical to a model pattern.
  Skill or Problem Area Addressed: Practice in visual transport,
  completing patterns, and paying attention to detail. Acuity in visual
  perception, comparative behavior, and pattern recognition.

Day 6
  Exercise: 7&8 - Combining Patterns: Subject must combine patterns in
  either an additive or subtractive manner.
  Skill or Problem Area Addressed: Visual transport (more complex),
  combining pattern features, acuity in visual perception. Abstracting
  relationships, applying relationships.

Day 7
  Exercise: 9 - Analogies: Subject must abstract the relationship and
  apply it to complete the matrix. Analogies criterion test.
  Skill or Problem Area Addressed: Transfer of learning to unique
  problems.

Day 8
  Exercise: 10&11&12 - Two by Three Analogies: Three by Three Analogies
  (Matrices). Matrices criterion test.
  Skill or Problem Area Addressed: Abstracting relationships from two
  sources of information in two directions, applying analytic processing.
  Transfer of training to novel problems.

Day 1 (Pre-Training): The implementation of the exercises began with
a review of two of the pre-training tests: the CEFT and the WLT. While
this activity was considered beneficial to the overall goals of the
training, its basic purpose was to get to know the children and to point out
the types of errors and answers given by the children. The CEFT and WLT
tests were chosen because of the nature of the tests in terms of providing
misleading but salient cues. An actual bottle half-filled with water was
also used as a demonstration of the WLT task.

During this pre-training period children were asked to provide their
own answers and to discuss them with each other. For the WLT task,
children were asked to go to the blackboard and draw their solution (i.e., the
water level line and location of the water in the bottle). Alternative
solutions were also asked for until about three or four different solutions
were obtained. A tally was then made to see which solution was preferred.
The actual bottle provided the correct answer, to the surprise of many
children and one adult observer.

Day 2 & 3 (Mediated Learning): This was by far the most extensive
part of the training and will be explained in some detail so that the
reader can get a flavor for the training. The first part of training consisted
of an exercise conducted by the adult trainer in interaction with the
children.

Mediated learning means that there is an interaction between the
trainer and the student. That is, the child actively participates in the
training. The trainer merely acts as an "adult" mediator who is there to
provide direction, place emphasis on certain features and point out errors
and correct responses. Thus, the trainer must see that each child performs
the task in the context of a discussion on how to solve it. In each task
the child first attempts to solve the problem, then it is solved by the
group. Incorrect responses are crossed out and correct responses made.
The training is designed to develop in the child the appropriate
experiences for dealing with the task.

The first exercise consists of 25 cards (see Appendix B). Each card
has a model figure in the upper left corner. Under the model figure is a
particular dimension label such as "color", "shape", "size" and "pattern."
The task is to match or find the object that is most like the model
according to the dimension (i.e., criterion) given. Each child is checked to
see that the task has been successfully completed before proceeding to the
next card. The cards are in the following order: color (items), shape, size
and pattern. (Extra cards are available for each criterion in case a child
has difficulty.) The children are told the directions and then asked to
name the dimension or criterion that is being looked for. The children then
mark a "+" sign in the box next to the one they think is correct. The
trainer then goes through each response, eliciting from the child why it is
correct or incorrect. After all the cards are complete, the children are
given a criterion mastery test to check for transfer. It consists of three
items from each dimension for a total of 12 items. When the children are
able to pass all twelve, they go to exercise 2. The following is an
example of the dialogue provided to the trainers:

Directions: Say to the children: "I am going to pass out some
booklets that are full of pictures of figures (shapes). In the
top left corner (pointing) you will see a figure (triangle) and
a word printed below it (color). The game is to find the figure
from below that is like the figure in the corner. The word
tells you how the figures should be the same. The rule for the
game is to use the word to find the figure that is like the one
in the corner. When you find the figure (shape) you should put
a "plus" in the box below it. O.K., let's try one. Remember,
look at the corner figure, and the word, then find the one that
is the same according to what the word says. Mark a plus in the
box under the one you choose. O.K., the first one says color.
We must find one from the bottom that is like the corner figure
in "color". What is the color of the top figure? Right, it's
white. So what are we looking for?" (If the children say "white
triangle", correct them by asking if the word says triangle or
just color. Emphasize that color is the rule (criterion) and
that shape doesn't count; only what is given in the "word"
counts.) Have the children mark an answer, then proceed as
follows.

O.K., is the first one (point) the one we are looking for? 
(pause) No. Why? Right, because it's the wrong color (if a 
child says it is correct because it has color, point out that 
white is also a color and we are looking for something white). 
Next, say: What about the middle one? Right, it's the same
color. But, before you make a "plus" we should check the last 
one just in case. Right, it's not the same color so it must be 
the middle one. So everyone put a "+" in the box below the 
middle figure. 

Following the first exercise children are given a criterion test
consisting of 12 items similar to those provided in the training. When
children have completed the criterion test without error, they are given
exercise 2. This exercise differs in that two dimensions (e.g., color and
shape) are given as criteria. When this is completed without errors,
the children move to exercise 3.

Day 4 (Dots training sheet): (from Feurenstein, 1979) The dots 
training sheet consists of a pre-training part and one exercise. In the 
pre-training children are first shown how to connect four dots to make a 
square and three dots to make a triangle. Next, the four dots and three 
dots are juxtaposed in the same picture frame and gradually shown close 
together in subsequent frames until they overlap. The dots forming the 
square are at first larger than those forming the triangle. By the last
row of frames they are the same size.

The difficulty of the task, of course, increases as the dots become
the same size and as the dots forming the square and those forming the
triangle overlap. Following training, children are given a mastery test
consisting of 19 smaller dotted frames. For some children the task was merely
a challenge, while for others it was extremely difficult.

The purpose of the dots task is, in part, to provide fun on an
initially easy task, and according to Feurenstein (1979) to provide the
children with experience in visual transport, forming visual structures,
organizing information, gathering precise data, overcoming distracting cues and
forming wholes from parts. However, criterion mastery of this task was not
required.

Day 5 (Figure and pattern completion): These tasks can be found in
Feurenstein (1979). They differ only in that additional figures are
involved. In both tasks the problem is to complete a model (criterion)
figure. The difference is that in figure completion, a partially completed
model figure (e.g., square, circle, star) is provided together with
alternative "parts" of which only one completes the figure. The child must select
the correct part. In the pattern completion task, a more complex model
figure is shown together with a partially completed figure. The task is to
draw in the missing parts so that the partially completed figure is similar
to the model or criterion figure. The figure completion task contains
patterns that are found in the Raven's test.

Again, according to Feurenstein, these exercises provide practice in 
visual transport, figural and pattern completion, paying attention to
detail, acuity in visual perception and comparative behavior. Subjects 
completed 12 items on the figure completion and 8 items on the pattern com- 
pletion tasks. If errors were made, they were pointed out, and the student 
asked to do them over. 

Day 6 (Combining patterns and analogies): The purpose of this day's
training was to provide children with experience in combining and
subtracting patterns in order to obtain a new pattern. This skill or strategy is
then applied to solving figure analogy problems.

The combining patterns task consists of two types of items (4 each).
The first involves visually adding two patterns (i.e., overlapping) and
selecting from four alternatives the one that would result. The second
involves determining what pattern would remain if part of the pattern were
removed.

In the visual analogies task the child is presented with a 2 x 2 
matrix in which a figure is missing. The task is to select the missing 
figure out of six alternatives. The child must "abstract" a relationship 
from the three figures and apply it to one of the alternatives in order to 
select the correct answer. There are three patterns (i.e., relationships) 
consisting of four items each for a total of 12 items.

Day 7 (Analogies Criterion Task): This task is simply a more complex 
version of the previous analogies task. Eight items are given, each of 
which involves different patterns and somewhat different relationships. 
Subjects complete this task until reaching 100% mastery. It is the only
task given this day and individual help is provided. Aside from the dots 
exercise, this was the first really difficult task. 

Day 8 (2 x 3 and 3 x 3 analogies, and matrices criterion task): Two
by three analogies simply involve an extra pattern in the first row of
a 2 x 2 analogy. However, the children are asked to "draw" the correct
answer rather than to select from alternatives. The task appeared easier
than the 2 x 2 task described above. There are 12 items in the task.

The 3x3 analogies or matrices problem consists of four sets of items 
with six alternatives for each set. The task is to select one of the six 
alternatives in order to complete the matrix.

In the matrices criterion task, there are eight items. The child is 
asked to draw the correct solution that will complete each matrix. In all 
of these tasks children are required to attain at least 80% mastery and 
are provided help (i.e., hints) in order to derive the correct solution. 
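
The mastery requirements described for the criterion tasks can be expressed as a simple check. The following is a minimal sketch and not part of the original training materials: the function name and the example scores are illustrative assumptions, while the thresholds (100% for the Day 7 analogies criterion task, at least 80% for the Day 8 tasks) and the item counts are taken from the text.

```python
from fractions import Fraction


def meets_criterion(correct: int, total: int, required: Fraction) -> bool:
    """Return True when the share of correct items reaches the required level."""
    if total <= 0:
        raise ValueError("total must be positive")
    return Fraction(correct, total) >= required


# Day 7 analogies criterion task: 8 items, 100% mastery required.
day7_pass = meets_criterion(8, 8, Fraction(1))

# Day 8 matrices criterion task: 8 items, at least 80% mastery required.
# The scores below are hypothetical examples, not data from the study.
day8_pass = meets_criterion(7, 8, Fraction(4, 5))  # 7 of 8 correct
day8_fail = meets_criterion(6, 8, Fraction(4, 5))  # 6 of 8 correct
```

Exact fractions are used so that a score sitting exactly on the threshold is not misclassified by floating-point rounding.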



CHAPTER III 
RESULTS: EFFECTS OF TRAINING 
Test Performance and Effects of Training 
Results of the effects of training presented in this chapter are
organized into four sections. The first section examines the relationships
between the various tests for Control group students. The second section
concerns test performance of both Training and Control group students on
the Figure Intersection Test (FIT) as a measure of information processing
capacity (i.e., M-level). Following the criterion set by Bereiter and
Scardamalia (1979), results for students who achieve an M-level of zero or
greater are presented together with an analysis of group comparisons on
M-level.

The third section presents results and analysis of training effects
for all students obtaining a minimum processing capacity of zero. In this
section item difficulties are presented and training effects are examined
for the Raven total score, Raven subscales (i.e., published Raven "sets"),
and theoretical subscales constructed by grouping items of the same
processing requirements (i.e., M-demand).

The fourth section focuses on the effects of training when processing
capacity is taken into account. In this section only subjects whose
processing capacity is equal to or greater than the processing demands of the
items are examined. Results are presented for the Raven total score and
the theoretical subscales.

Results for students obtaining an M-level greater than or equal to
zero and for those matched with the processing demands of the test are
presented so that a complete picture of test performance and the effects of
training is obtained. However, since subjects should have the minimum
processing capacity for training to be effective in the first place (Case,
1974), matching subjects' processing capacity to processing demands is the
main focus of this study.
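
The two selection rules described above (an M-level of zero or greater, and an M-level at least equal to the M-demand of the items) can be sketched as simple filters. This is an illustration only: the `Subject` record, the function names, the integer coding of M-level, and the sample values are all hypothetical and are not taken from the study's data or procedures.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Subject:
    subject_id: str  # hypothetical identifier
    m_level: int     # processing capacity estimated from the FIT (assumed coding)


def minimum_capacity(subjects: List[Subject]) -> List[Subject]:
    """Keep subjects whose M-level is zero or greater."""
    return [s for s in subjects if s.m_level >= 0]


def matched_to_demand(subjects: List[Subject], m_demand: int) -> List[Subject]:
    """Keep subjects whose M-level meets or exceeds the items' M-demand."""
    return [s for s in minimum_capacity(subjects) if s.m_level >= m_demand]


# Made-up sample, for illustration only.
sample = [Subject("A", -1), Subject("B", 0), Subject("C", 2), Subject("D", 3)]
kept_minimum = [s.subject_id for s in minimum_capacity(sample)]
kept_matched = [s.subject_id for s in matched_to_demand(sample, m_demand=2)]
```

Under the second rule the set of subjects retained changes with the M-demand of the item group being analyzed, which is why results are reported separately for the theoretical subscales.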
Relationship Among Tests 

The results of the pre-training tests for the training and control
groups are given in Table 3. The pre-training tests include the Children's
Embedded Figures Test (CEFT), the Water Level Test (WLT), the Figure
Intersection Test (FIT), and the Cartoon Conservation Scales (CCS). Observation
of the results indicates that there is little difference between Training
and Control groups for Blacks and a slight difference on the CEFT for the
Hispanics. For the Anglo sample there is an apparent trend in favor of the
control group. In order to examine performance on the pre-tests, a post
hoc analysis was performed using multiple t-test confidence intervals with
the Type I error controlled by dividing the alpha level across the four
comparisons in each race (i.e., .05/4 ≈ .01). The computations for the
confidence intervals were performed according to Marascuilo (1971, p. 323).
The results for the one Hispanic and the four Anglo comparisons are
summarized in Table 4. All other mean differences reported in Table 3 are of
such small magnitude (i.e., of no educational significance) as to not warrant
testing.

TABLE 3

Average Test Scores and Standard Deviations
for Training and Control Group on the CEFT,
WLT, CCS, FIT and Raven by Group

[CEFT and WLT columns are not legible in the source copy.]

                          CCS             FIT             Raven
Group                  Mean    SD      Mean    SD      Mean     SD      N
Black, Training        19.4   4.91     17.1   7.13     35.6   10.22    61
Black, Control         19.?   5.43     16.8   9.91     28.2   10.37    73
Hispanic, Training     19.5   4.02     19.5   9.79     34.2   12.13    33
Hispanic, Control      19.6   5.02     20.8   9.65     30.0   10.50    50
Anglo, Training        21.4   6.63     22.1   9.28     41.4    6.56    34
Anglo, Control         23.3   4.77     23.5   8.36     37.4    9.21    40


TABLE 4

Multiple Comparisons Between Treatment and
Control Groups on Pre-Training Tests
for Blacks, Hispanics and Anglos

Comparison
(Control - Treatment)    Difference       SE         LL        UL

Hispanic (df = 81)
  CEFT                       2.1        1.2963     -1.32      5.52

Anglo (df = 72)
  CEFT                       2.0         .93001    -.464      4.46
  WLT                        2.3         .98876    -.320      4.92
  CCS                        1.9        1.32922    -1.62      5.42
  FIT                        1.4        2.03123    -4.04      6.84

t determined for p < .01, two-tailed, df = N1 + N2 - 2
CI = difference ± t(SE): for Hispanics the critical value for t = 2.64 and
for Anglos t = 2.65
(Marascuilo, 1971, p. 323)

The results shown in Table 4 indicate that there are no significant
differences between Training and Control groups on any of the tests
compared. Thus, the randomization procedure for selecting treatment and control
students was effective. Nevertheless, the consistent trend demonstrated in
the Anglo groups should serve as a caveat in later discussions.
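
The interval limits in Table 4 follow from the form given in the table note: the mean difference plus or minus the critical t multiplied by the standard error. The sketch below is an illustration under stated assumptions rather than the authors' computation. The function name is hypothetical, the critical value is taken as reported instead of being derived, and the inputs are the Hispanic CEFT values from Table 4.

```python
def confidence_interval(difference: float, standard_error: float, t_critical: float):
    """Return (lower, upper) limits for difference ± t_critical * standard_error."""
    half_width = t_critical * standard_error
    return difference - half_width, difference + half_width


# Type I error divided across the four comparisons within a group.
alpha_per_comparison = 0.05 / 4

# Hispanic CEFT comparison (Control - Treatment), values as reported in Table 4.
lower, upper = confidence_interval(difference=2.1, standard_error=1.2963, t_critical=2.64)
lower_rounded, upper_rounded = round(lower, 2), round(upper, 2)

# An interval that contains zero indicates no significant difference.
contains_zero = lower < 0.0 < upper
```

Because every interval in Table 4 spans zero, none of the comparisons reaches significance at the adjusted level, which is the basis for the conclusion about randomization.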

In the following, an examination is made of the relationship between
the pre-training tests and the Raven. This includes a comparison of the
pattern of correlations among the tests in the control groups for each
race.

Intercorrelations Among Tests 

Previous research (e.g., Bereiter and Scardamalia, 1979; Case &
Globerson, 1974) indicates that performance on cognitive style, cognitive
developmental and analytic intelligence measures is related to a subject's
tendency to use a large central computing space (M-space) in approaching a
cognitive task.

Specifically, Bereiter and Scardamalia (1979) report a Pearson
correlation of .71 between one version of the FIT and the Raven Progressive
Matrices in an Anglo sample. They conclude that the Raven and the FIT are
essentially measuring the same construct, i.e., information processing
capacity.

Similarly, Case and Globerson (1974) present empirical evidence in
support of the notion that disembedding situations (i.e., CEFT) require a
relatively large amount of central computing space in order to solve the
task. According to Case and Globerson, a moderate correlation is to be
expected between measures such as the CEFT, Raven and FIT tests.

De Avila and Havassy (1974) and Pascual-Leone (1969) demonstrated that
performance on neo-Piagetian developmental measures is related to both
information processing capacity and cognitive style. In particular,
Pascual-Leone argues that a substantial proportion of the variance on
developmental and cognitive style measures is due to their shared variance
with information processing capacity.

Given this information one would expect a pattern of correlations in
which at least a moderate relationship would be exhibited between all of
the measures used in this study. In particular, however, a fairly strong
correlation would be expected between the FIT and the Raven.

Since a major hypothesis of this study is that culture-loading may
occur whenever a test is not measuring the same underlying construct in
diverse groups, and that this is a source of bias in test performance, it
would be interesting to examine the interrelationships between the tests.
If the Raven test exhibits a cultural bias (i.e., is culture-loaded) then
one would expect that the pattern of correlations would not be the same for
diverse ethnic groups.

The Pearson correlation matrix for Black, Hispanic and Anglo control
group subjects is shown in Table 5. While it is recognized that the three
groups are not considered comparable, it should be pointed out that
whatever differences exist between the groups are manifest in the pattern of
correlations and reflect each group's characteristics as they normally
exist in public schools. As such, a comparison of the pattern of
correlations is meaningful to the extent that it reflects such differences.

TABLE 5

Pearson Correlation Coefficients among Age, CEFT,
WLT, CCS, FIT and Raven Tests for Black,
Hispanic and Anglo Control Students

CONTROL GROUP STUDENTS

            AGE     CEFT    WLT     CCS     FIT

Black (n = 73; if r ≥ .19, p < .05)
CEFT        .12
WLT         .18     .31
CCS         .05     .20     .33
FIT         .12     .56     .47     .27
RAVEN       .12     .43     .43     .56     .48

Hispanic (n = 50; if r ≥ .23, p < .05)
CEFT        .54
WLT         .03     .27
CCS         .16     .30     .27
FIT         .01     .30     .43     .42
RAVEN       .06     .33     .37     .55     .49

Anglo (n = 40; if r ≥ .26, p < .05)
CEFT        .37
WLT         .21     .54
CCS         .40     .58     .66
FIT         .36     .47     .56     .54
RAVEN       .32     .60     .57     .57     .71

The correlations in Table 5 indicate that the relationship between the
tests is similar in that all correlations are significant. However,
correlations with age are significant in only the Anglo group. Moreover, the
pattern of correlations, as well as the magnitude, is somewhat different.
In particular, the correlation between the FIT and Raven for the Anglo
group is significantly different from that for the Black and Hispanic groups
(p < .05). This result suggests that the Raven (or the FIT) may be measuring
something different in the minority groups. This is examined in the
following section.
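
The text does not state which procedure was used to compare the FIT-Raven correlations across groups. One conventional option for correlations from independent samples is the Fisher r-to-z transformation, sketched below. This is offered as an assumption, not as the authors' method; the function name is hypothetical, and because the original procedure and the number of tails are not given, the result of this sketch need not match the reported significance level. The correlations and sample sizes are those in Table 5.

```python
import math


def fisher_z_difference(r1: float, n1: int, r2: float, n2: int):
    """Compare two independent Pearson correlations.

    Returns the z statistic and a two-sided p-value based on the
    normal approximation to the Fisher-transformed correlations.
    """
    z1, z2 = math.atanh(r1), math.atanh(r2)
    standard_error = math.sqrt(1.0 / (n1 - 3) + 1.0 / (n2 - 3))
    z = (z1 - z2) / standard_error
    p_two_sided = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p_two_sided


# FIT-Raven correlations for control students (Table 5).
z_black, p_black = fisher_z_difference(0.71, 40, 0.48, 73)        # Anglo vs. Black
z_hispanic, p_hispanic = fisher_z_difference(0.71, 40, 0.49, 50)  # Anglo vs. Hispanic
```

A one-tailed p-value, if that is what was intended, is half of the two-sided value returned here.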
Factor Analysis 

One way to examine the pattern of correlations is through factor
analysis. By reducing the number of variables to a smaller set, one may
examine the interrelationship between the tests and infer the source of the
variance accounting for the observed interrelations in the data. As was
suggested in the above discussion, it is expected that there is a common
source of variance underlying performance on all the tests. In addition,
it is hypothesized that there is an additional source of variability not
really related to what the test is intended to measure. This additional
source, or factor, is hypothesized to contribute to the culture-loading in
a test.

Jensen (1980) demonstrated the utility of a factor analytic approach
in examinations of test bias. A difference exists here, however, in that a
theoretical rationale has been provided which suggests that the tests
administered would be applicable to a factor analytic approach. That is, there
is an underlying communality in test performance due to information
processing capacity and an intervening or common extraneous source due to the
processing strategy applied. One would expect then that an additional, or
culture-loading, factor would emerge for the minority groups and that tests
susceptible to this "bias factor" would load appreciably on this factor.

The method of factor analysis applied is the principal factor solution
with varimax rotation. This method was selected because it replaces the
main diagonal elements of the correlation matrix with communality estimates
and thus automatically produces so-called inferred factors. Table 6 shows
the results of the factor analysis for control group students in each
ethnic group.

Table 6 shows the first principal factor (unrotated) for each ethnic
group and the rotated factor matrix. For the Black and Hispanic groups two
factors were extracted and rotated. The Anglo group, however, revealed
only the first principal component and thus no rotation was necessary.
Table 6 also shows the communalities for each variable (i.e., the total
proportion of variance in each variable accounted for by the factors).
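
Since the factors are orthogonal, the communality of a variable equals the sum of its squared loadings across the retained factors. The short sketch below illustrates that relationship using loadings reported in Table 6; the function name is hypothetical and the sketch is not a reproduction of the original analysis.

```python
def communality(loadings):
    """Sum of squared loadings of one variable across orthogonal factors."""
    return sum(loading * loading for loading in loadings)


# Black control group, CEFT: rotated loadings on Factor 1 and Factor 2.
black_ceft = communality([0.60677, 0.17051])

# Anglo control group, CCS: a single unrotated factor was retained.
anglo_ccs = communality([0.80164])
```

Both values agree with the communalities listed in Table 6 (.39724 and .64263) to within rounding. Small departures for some other variables are to be expected when communalities are estimated iteratively.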

The mere fact that the number of factors extracted for the minority
groups differs from that of the Anglo group indicates that something
different is measured in the combined set of variables. The discussion
provided above suggests that a common source of the variance in the set of
variables is due to information processing capacity. Thus we would expect
that at least one of the factors would represent this construct.

It is fairly clear that for the Anglo group the factor represents
information processing capacity. All of the variables except
age load heavily on this factor. It is noted, too, that age, which is also
correlated with information processing capacity (i.e., it is a
developmental variable), has a restricted variance due to the nature of the sample,
i.e., fourth and fifth graders. Thus, the main source of variability
for the Anglo group is exactly what would be expected from the
battery of tests given.

The minority group factor analyses are not so clear in terms of labeling
the factors. It was expected that a factor representing processing
strategy would emerge. The unrotated first principal component for each of
the minority groups differs from that of the Anglo group in that the
loadings are lower due to the additional factor. Moreover, the
communalities indicate that, in general, a greater proportion of the
variance in each variable is accounted for by a single factor in the Anglo
group than is explained with two factors in the minority groups.

TABLE 6

Principal-Factor Solution, Factor Analysis with Varimax
Rotation of CEFT, WLT, CCS, FIT, Raven and Age for
Black, Hispanic and Anglo Control Groups

BLACKS (Number of Factors = 2)

              Unrotated        Rotated Factors         Communality
Variables     1st Factor     Factor 1     Factor 2         H2
CCS             .53207        .19491    [illegible]    [illegible]
WLT             .59115        .41239    [illegible]    [illegible]
CEFT            .57317        .60677       .17051         .39724
FIT             .78196        .90563       .14056         .84028
Raven           .78366        .43776       .69558         .67548
Age             .33576        .02296       .49280         .24289

HISPANICS (Number of Factors = 2)

              Unrotated        Rotated Factors         Communality
Variables     1st Factor     Factor 1     Factor 2         H2
CCS             .70732        .73391    [illegible]    [illegible]
WLT             .55202        .46163    [illegible]    [illegible]
CEFT            .56588        .36044       .51260         .39288
FIT             .65396        .63773       .20049         .44669
Raven           .71721        .79232       .05516         .63082
Age             .47371        .04670       .87786         .77302

ANGLOS (Number of Factors = 1)

              Unrotated      Communality
Variables     1st Factor         H2
CCS             .80164         .64263
WLT             .73386         .53854
CEFT            .71676         .51374
FIT             .76227         .58105
Raven           .81300         .66097
Age             .32175         .10352

The results of the rotated factor matrix show that the Raven test is
loaded on both factors in the Black sample but on only one factor in the
Hispanic sample. This suggests that the nature of culture-loading
hypothesized in this study is not reflected as much with the Hispanic
group, at least with the set of variables included in the analysis. Thus,
while a second factor emerged, the Raven test did not load appreciably on
this factor, and the results for Blacks and Hispanics must be considered
separately.
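
The communalities in Table 6 can be checked against the rotated loadings, since a communality is the sum of a variable's squared loadings across the retained factors. The sketch below is illustrative only: it uses the Black-sample rows that are legible in Table 6, and the function names are hypothetical, not part of the original analysis.

```python
# Rotated loadings (Factor 1, Factor 2) and printed communalities for the
# Black sample in Table 6. Only rows legible in the source are included.
ROTATED = {
    "CEFT":  ((0.60677, 0.17051), 0.39724),
    "FIT":   ((0.90563, 0.14056), 0.84028),
    "Raven": ((0.43776, 0.69558), 0.67548),
    "Age":   ((0.02296, 0.49280), 0.24289),
}


def communality(loadings):
    """Sum of squared loadings of one variable across the retained factors."""
    return sum(x * x for x in loadings)


def dominant_factor(loadings):
    """1-based index of the factor with the largest absolute loading."""
    return max(range(len(loadings)), key=lambda i: abs(loadings[i])) + 1


if __name__ == "__main__":
    for name, (loadings, printed) in ROTATED.items():
        print(f"{name:6s} computed h2 = {communality(loadings):.5f}  "
              f"printed h2 = {printed:.5f}  "
              f"dominant factor = {dominant_factor(loadings)}")
```

Under these assumptions the computed values agree with the printed communalities to about three decimals, and the FIT and Raven fall on different rotated factors, which is the pattern the interpretation for the Black sample relies on.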

A possible interpretation of the results for Blacks is that the first 
rotated factor represents a processing strategy factor while the second 
represents an analytic ability or processing capacity factor. The rationale
for this interpretation is the fact that the FIT and Raven are loaded
on different factors. Moreover, the CEFT also loads more heavily on the 
factor defined primarily by the FIT. 

The theoretical discussion provided above suggested that the set of
tests have information processing capacity in common. However, the tests
also have a cognitive style factor in common. Cognitive style is known to
affect performance on information processing tasks and analytic tasks which
require a disembedding solution (Case & Globerson, 1974). Thus, the
interpretation of the first factor as a processing strategy factor is
consistent with this expectation. Additionally, the CCS also loads
primarily on the second factor defined by the Raven. The one curious result
is the rather moderate loading of the FIT on factors defined by the Raven.



The results for the Hispanics are clearer. For example, the first
rotated factor can be identified as an analytic or information processing
factor. The second factor is defined by age and the CEFT. The WLT and the
FIT load only moderately on this factor. The important thing to note is
that, while there is an additional factor associated with processing
strategy, the Raven did not load on this factor. Nevertheless, cognitive
style did not show the same relationship in the battery of tests. This
indicates that it is a source of variance between subjects (i.e., age
groups) but is not necessarily related to performance on the Raven as was
expected.

The factor analyses indicate that the main difference in patterns of
correlations is age related. The correlation matrices indicated this and
the factor analysis demonstrated it. In general, it appears that the Black
and Anglo groups differ most in terms of variables related to Raven test
performance while the Hispanics are somewhat similar to the Anglo group.
Two factors did emerge for the minority groups, and only one for the
Anglos.

In the following, the results of the FIT test as a measure of
information processing level (M-level) and the effect of training on Raven
test performance are presented. Following this is an item analysis of the
Raven and an examination of the culture-loading hypothesis.

Information Processing Capacity

Information processing capacity or M-level is defined as the number of
discrete pieces of information that can be processed simultaneously. The
measure of M-level used here is the Figure Intersection Test (FIT;
Pascual-Leone, 1969). Scoring for the FIT to obtain a subject's M-level
followed the procedures described by Bereiter and Scardamalia (1979). In
this procedure the percentages of correct responses on each FIT subscale
are summed and a constant of 1.5 is subtracted from the total. The result
is the subject's "M-level."

Some subjects will obtain an M-level less than zero when this procedure
is used. Subjects obtaining M-levels less than zero are thought to have
done so because of inattentiveness or failure to grasp the task
instructions (Bereiter & Scardamalia, 1979). Consequently, following the
precedent established by Bereiter and Scardamalia, such subjects are
dropped from the analysis. The results of M-level assignments are given in
Table 7, together with the number of subjects obtaining an M-level of less
than zero for each group.
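
The scoring rule and the exclusion rule described above can be sketched as follows. This is a minimal illustration, not the scoring program used in the study: it assumes that "percentage correct" is expressed as a proportion between 0 and 1 on each FIT subscale, and the function names and example values are hypothetical.

```python
def m_level(proportion_correct_by_subscale, constant=1.5):
    """M-level = sum of per-subscale proportions correct minus a constant.

    Assumption: each entry is a proportion in [0, 1]; the constant of 1.5
    is the one stated in the text (after Bereiter & Scardamalia, 1979).
    """
    for p in proportion_correct_by_subscale:
        if not 0.0 <= p <= 1.0:
            raise ValueError("each subscale score must be a proportion in [0, 1]")
    return sum(proportion_correct_by_subscale) - constant


def keep_subject(score):
    """Subjects with an M-level below zero are dropped from the analysis."""
    return score >= 0


if __name__ == "__main__":
    # Hypothetical subjects, each scored on several FIT subscales.
    attentive = m_level([1.0, 1.0, 0.75, 0.5, 0.25])
    inattentive = m_level([0.5, 0.25, 0.25])
    print(attentive, keep_subject(attentive))
    print(inattentive, keep_subject(inattentive))
```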

Table 7 shows the number and percentage of students obtaining a given
M-level of zero or greater on the FIT. The results are roughly equivalent
to what would be expected for subjects of this age group (Case, 1972). The
mean M-level rank, average M-level, standard deviation and median M-level
for Black, Hispanic and Anglo subjects are shown at the bottom of the
table.

In order to test for group differences in the distribution of M-levels,
a Kruskal-Wallis one-way analysis of variance on the ranks, following the
cell-means procedure described by Marascuilo and Levin (1976), was
performed with planned comparisons on selected groups. The cell-means
model of analysis allows for tests of hypotheses normally associated with
either a nested analysis (i.e., between treatments within race) or a fully
crossed analysis (i.e., interaction). The cell-means model is basically a
one-way analysis of variance in which each group is treated as a single
block. This results in six groups defined as follows: one group each for
Black, Hispanic and Anglo Training groups and one each for Black,
Hispanic, and Anglo Control groups.
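
The rank step of this analysis can be illustrated with the M-level frequencies reported in Table 7. The sketch below is an assumption-laden reconstruction, not the authors' program: it pools all six cells, assigns midranks to tied M-levels, and ranks from the highest M-level downward (a direction inferred because it reproduces the mean ranks printed in Table 7). The Anglo Training count at M-level 2 is not legible in the source and is inferred as 3 from the cell total of 31 and the printed 9.7 percent.

```python
# Number of subjects at M-levels 0..6 in each cell, as read from Table 7.
# The Anglo Training count at M-level 2 (3) is inferred, not legible.
COUNTS = {
    "Black Training":    [6, 13, 16, 12, 7, 4, 0],
    "Black Control":     [13, 7, 9, 13, 8, 12, 1],
    "Hispanic Training": [3, 2, 4, 6, 7, 7, 0],
    "Hispanic Control":  [1, 5, 7, 6, 12, 9, 4],
    "Anglo Training":    [1, 2, 3, 7, 6, 9, 3],
    "Anglo Control":     [1, 4, 1, 7, 11, 12, 2],
}


def midranks(counts, descending=True):
    """Midrank of each M-level in the pooled sample (tied scores share a rank)."""
    levels = range(len(next(iter(counts.values()))))
    pooled = [sum(cell[k] for cell in counts.values()) for k in levels]
    order = reversed(levels) if descending else levels
    ranks, used = {}, 0
    for k in order:
        if pooled[k]:
            # Ranks used+1 .. used+pooled[k] are tied; take their average.
            ranks[k] = used + (pooled[k] + 1) / 2
            used += pooled[k]
    return ranks


def mean_ranks(counts, descending=True):
    """Mean pooled rank of the subjects in each cell."""
    ranks = midranks(counts, descending)
    return {
        cell: sum(n * ranks[k] for k, n in enumerate(freq) if n) / sum(freq)
        for cell, freq in counts.items()
    }


if __name__ == "__main__":
    for cell, value in mean_ranks(COUNTS).items():
        print(f"{cell:18s} {value:7.2f}")
```

With this ranking direction a larger mean rank corresponds to a lower typical M-level, so the planned contrasts on the ranks compare cells by differencing these means.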



Table 7 



M-level(a) Distribution by Treatment Group for
Black, Hispanic and Anglo Groups

                   Black                  Hispanic                 Anglo
              Train     Control       Train     Control       Train     Control
M-level       N    %     N    %       N    %     N    %       N    %     N    %
0             6  10.3   13  20.6      3  10.3    1   2.3      1   3.2    1   2.6
1            13  22.4    7  11.1      2   6.9    5  11.4      2   6.5    4  10.5
2            16  27.6    9  14.3      4  13.8    7  15.9      3   9.7    1   2.6
3            12  20.7   13  20.6      6  20.7    6  13.6      7  22.6    7  18.4
4             7  12.1    8  12.7      7  24.1   12  27.3      6  19.4   11  28.9
5             4   6.9   12  19.0      7  24.1    9  20.5      9  29.0   12  31.6
6             0   0      1   1.6      0   0      4   9.1      3   9.7    2   5.3

N            58         63           29         44           31         38
Mean Rank  169.16     150.29       126.24     111.88       100.48      98.38
M < 0 (N)     3         10            4          6            3          2

                 Black        Hispanic       Anglo
Mean              2.41          3.36          3.75
SD               1.636         1.602       [illegible]
Median        [illegible]   [illegible]    [illegible]
N                 121            73            69

a M-level = (sum of P_i) - 1.5, where P_i = percent items correct on FIT
  subscales (Bereiter & Scardamalia, 1979)

Nine contrasts were computed for comparisons between Training and
Control groups in each ethnic group (3), between Black-Anglo and
Hispanic-Anglo within Training and Control groups (4), and two interaction
contrasts comparing the differences between the Training and Control groups
for Black and Hispanic with the difference between the Anglo Training and
Control groups. The planned contrasts were computed according to the
procedures described in Marascuilo and McSweeney (1977). Since both the
full nested analysis and the fully crossed analysis allow for an overall
.15 Type I error rate (alpha), this error rate was distributed across the
nine contrasts using probabilities obtained from Dunn's (1961) table of
critical values. With this procedure each contrast is tested at an alpha
level of .0167. Table 8 presents the results of the Kruskal-Wallis analysis
on the ranks.

Of the nine contrasts shown in Table 8, two are significant. These
involved the comparisons between Black vs. Anglo Training and Control
groups. Direct interpretation is difficult because of the confounding of
school attended, socio-economic status and male-female distributions in
each ethnic group. In addition, the factor analysis for the Black students
in the sample suggests that the FIT test may not be measuring the same
thing in each group. The comparisons were performed to examine the
distributions of the samples in the study and are not amenable to
generalizations beyond this purpose.

Analysis of the Raven Test 

Table 8

Kruskal-Wallis A Priori Contrasts(a)
between Selected Pairwise Groups on M-Level

                                                              Confidence Interval(b)
Comparison                                  ψ̂         SE          LL           UL
ψ1: Black Training vs. Control            18.869    13.6386    -13.7272    [illegible]
ψ2: Hispanic Training vs. Control         14.366    17.9266    -28.4780    [illegible]
ψ3: Anglo Training vs. Control             2.102    18.1391    -41.2389    [illegible]
ψ4: Black Training vs. Anglo Training     68.671    16.6749     28.8180     108.5240*
ψ5: Hispanic Training vs. Anglo Training  25.757    19.3624    -20.5191      72.0331
ψ6: Black Control vs. Anglo Control       51.618    15.3944     14.8254      88.4106*
ψ7: Hispanic Control vs. Anglo Control    13.493    16.5978    -26.1758      53.1618
ψ8: ψ1 - ψ3 (Interaction)                 16.767    22.6948    -37.4336      71.0076
ψ9: ψ2 - ψ3 (Interaction)                 12.264    25.5027    -48.6875      73.2155

a Corrected for tie values, t = 2.9708 (Marascuilo & McSweeney, 1977)
b t(Dunn) = 2.39, Q = 9, total alpha = .15
* Statistically significant, alpha = .017

There are sixty items in the Raven test. The items are grouped into
five subscales of 12 items each. Each item within a subscale becomes
progressively more difficult, as does each successive subscale. The
subscales are also dependent upon different cognitive processes. That is,
different cognitive processes may affect the complexity of the item in
terms of the amount of information that must be processed in order to
correctly solve the item. For example, the easier subscales can be solved
by a global or visual processing strategy, while later subscales are
dependent more on an analytic processing strategy. According to Bereiter
and Scardamalia (1979), there are at least three factors which affect the
difficulty level of Raven items. These three factors are summarized in the
following:

1. Analytic Strategy: Three types of problems are identified which
   are a function of whether the item type involves a) pattern
   repetition, b) elements permuted, or c) progressions.

2. Perceptual Factor: Influences the difficulty level of an item in
   an undetermined way; the item may be made easier or more difficult
   depending on the nature of the perceptual factor and that of the
   item. Perceptual factors are grouped under a) Gestalt effects and
   b) embedding of figures.

3. "Copy Strategy": This is simply a means by which subjects compare
   consecutive figures to determine the closest match. This strategy
   was identified on the basis of subjects' eye movements.

The theoretical discussion provided in Chapter I leads to the
prediction that training would have a greater effect on the more difficult
items (i.e., those involving more complexity) and on those items involving
perceptual factors. It is hypothesized that non-trained minority subjects
tend to respond more to perceptual factors than do their Anglo
counterparts. For complex tasks, a perceptual processing strategy would
require more processing capacity on the part of the subject in order to
overcome the distraction caused by the misleading but salient perceptual
factors. For this reason, it is hypothesized that many students perform
poorly on analytic and disembedding processing tasks. The question thus
becomes one of removing these differences so that assessment of the desired
characteristics can be made. The research questions to be examined in this
part of the study are summarized in the following:

H0: Training will have a positive effect on test performance for
    each ethnic group; however, the change (i.e., gains) will be
    greater for minority subjects.

H0: Training will have its greatest effect on the most difficult
    subscales of the Raven (difficulty based on percent passing).

H0: Training will have its greatest effect on the most complex
    Raven test items (complexity defined in terms of item
    processing demands).

H0: Raven test items identified, a priori, as culture-loaded will
    show a greater effect due to training than items not
    identified as culture-loaded. That is, training will have the
    greatest effect on the items identified as culture-loaded.

Results 

Percentage passing each Raven item, average subscale and Raven total
score are provided in Table 9 for each ethnic group. There are 216 means
provided in Table 9. Of initial interest are Raven total test and mean
subscale scores. From Table 9 it can be seen that the average difference
between all subjects in the training and control groups for Blacks,
Hispanics, and Anglos is 7.4, 4.2 and 3.9 respectively for the Raven total.
These results represent the performance of all subjects regardless of
M-level. The analysis reported below, however, is restricted to performance
of subjects obtaining an M-level of at least zero for reasons discussed
earlier, so that the results will be consistent with the item analysis
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TABLE 9 

Item Difficulties (Percent Passing) on Raven
Test Items for Black, Hispanic, and Anglo Control
and Training Groups



I t ern 

R1 
R2 
R3 - 
R4 
R5 
R6 
R7 ' 
R8 
R9 
RIO 
R1 r 
R12 
A SUB 
R13 
R14 
R15 
R16 
R17 
R18 
R19 
R20 
R21 
R22 
R23 
R2 4 
BSUB 
R25 
R26 
R27 
R28 
R29 
R30 
R31 
R32 
R33 
R34 
R35 
R36 
CSUB 
^R3<\ 
R36 
R39 
R40 
R4 1 
R42 
843 
R44 
R45 
R46 
R4 7 
R4 8 
DSUB 
R49 
R50 
R51 
R52 
R53 
R54 
R55 
R56 
R37 
R?8 
R59 
R60 
ESUB 
Raven 
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Co* 

Percent" 
Pass I ng 
1 .0000 
1 .0000 
0.9726 
1.0000 
0.9041 1 

• 0.9.041 
0.6164 

'.vd.8219 , 

- 0.7260 
0.3890 
0.41 10 
0. 1781 
9. 1370 
0.9863 
0.8904 
0.7671 
0.6575 
0.6027 
0.4521 
0.47.47 
0.41 10 
0.4384 
0.5205 
0.5342 
0.2055 
6.8767 
0.7945 
0.7123 
0.6712 
0.5753 
0^5479 
0.4521 
0.3286 
0.3151 
0.5068 
0.2192 
0. 1370 
0.1096 
5.3699 

* 0.8082 
0.6301 
0.5479 
0.6164 

- 0.6436 
0.5205 
0.3699 
0,5068 
0^51 

0.1096 

0.0548^ 

5.3836 

0.2740 

0.2877 

gstqtm* — 

0.M370 
0.0822 
0.0959 
0. 1233 
0.0548 
♦ 0.0548 
0.0548 
,0.0959 
0.0274 
I • 5205 
28.1507 

(N-73) 
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Black Students 



trol 

"Standard 
Dev I at I on 
0.0 
0.0 

• 1644 
0\0 

0.\2963 
0.2,9 6? 
0.4896 
0.38 x 52 
0.44&1 
9.4954 
0.4954\ 
0.3852 \ 
1.9601 \ 
0.1170 \ 
0.3140 \ 
0.425* \ 
0.4778 
0.4927 
0..50 1 1 
0..4977 
0.4954 
0.4996 
0.5030 
$0. 5023 
0.406a 
3.3827 
0.4068 
0.4558 
0.4730 
.4977 

• 501 1 
50 1 I 

014730 
0.V4678 
0.5034 
0.4l 66 
0.3M62 
0.5145 
2.A557 
0/3964 
0.4861 
0.501 1 
0.4896 
0.4822 , 
0.5030 
0.4861 
0.5034 
0.4678 
0.4558 
0.3145 
0.2292 
3.2814 
0.4491 
0.4558 
n0.4166 . 
0.3462 
0.2766 
0.2965 
0.3310 
0.2292 
0.2292 
0.2292 
0.2963 
0. 1644 . 
1.4153 
10.3664 




Tra I n I ng 

Percent 

Pass I ng 

0.9836 

1.0000 

1.0000 

1.0000 

0.9836 

1.0000 

0.8033 

0.8325 , 

0.90I6 

0.7049 

0.5738 

0.3934 
10.1803 

0.9836 

0.9508 

0.8197 

0.8689 

0.6066 

0.6066 

0.6393 

0.6230 

0.6885 

0.8033 

0.6885 

0.3770 

8.6557 

0.8852 

0.7049 

0.8033 

0.7049 

0.7213 

0.63?3 

0.7377 

0.5246 

0.7541 

0.2623 
"0.1311 

0.0328 

6.916 

0.9344 

0.7705 

0.7213 

0.7377 

0.8361 

0.6557 

0.4426 

0.6230 

0.3934 

0.5082 

0.1211 

0.0656 

6.8197 

0.5574 

0.5246 

0.4098 

0.2951 

0.3443 

0.2295 

0. 131 1 

0. 1639 

0.1475 

0..0492 

0.0328 

0.0656 

2.9508 
35.5738 



Standard 
Dev I at Ion 
0. 1280 
0.0 
0.0 

0.0 > 
0. 1280 
0.0 

0.4088 
0.3576 
0.3003 
0.4599 
0.4986 
0.4926 
1 .4777 
0.1280 
0.2180 
0.3877 
0.3404 
0.4926 
0.4926 
0.4842 
0.4887 
0.4669 
0.4008 
0.4669 
0.4887 
2.8803 
0.3214 
0.4599 
0.4008 
0.4599 
0.452 I 
0.4842 
0.4435 
0.5035 
0.4342 
0.4435 
0.3404 
0. 1786 
2.7185 
0.2496 
0.4240 
0.4521 
0.4435 
0.3733 
0.4791 
0.5008 
0.4887 
0.4926 
0.5041 
0.3404 
0.2496 
2.84 90 
0.5008 
0.5035 . 
0.4959 
0.4599 
0.4791 
0.4240 
0.3404 
0.3733 
0.3576 
0.2180 
0 6 1796 
0.2496 
2.4928 
10.2233 



(N-61 ) 
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Item Passing 

R1 1.0000 

R2 4 1 .0000 

R3 1.0000 

R4 0.980Gf 

R5 0.9800 

R6 0.9400 

R7 0.6600 

R8 0.7600 

R9 0.8400 
RIO - 0.7600 

R11 0.6200 

R12 0.4200 

ASUB 9.9600 

R13 0.9800 

R14 0.9600 

R15 0.7400, 

R16 0.680C 

R17 0.5600 

R18 0.5000 

R19 0.4200 

R20 0.4400 

R21 0.5000 

R22 0.6200 

R23 0.4200 

R24 0.2400 

BSUB 7.0600 

R25 0.8600 

R26 0.7800 

R27 0.7400 

R28 0.5600 

R29 0.6400 

R30 0.4000 

R31 0.5600 

R32 0.4000 

R33 0.4800 

R34 0.1600*. 

R35 0.1000, 

R36 0.1400 

CSUB 5.8200 

R37 0.8400 

R38 0.6000 

R39 0.5400 

R40 0.5600 

R41 0.7400 
R42 r 0.5000 

R43 0.4000 

R44 €.4600 

R4 5 6.26j00~ 
R46 » 0.2000 

R47 0.1200 

R48 0.0600 

DSUB 5.2800 

R49 0.5000 

R50 0.3400 

R51 0.2800 

R52 0.1800 

R53 0.2200 

R54 0.1400 

R55 0.1000 

R56 0.1200 

R57 0.0400 

R58 0.0600 

R59 0.0600 

R60 0.0800 

CSUB 2.1200 

Raven 30.0400 

(N-30) 



TABLE 9 (Continued)
Item Difficulties (Percent Passing) on Raven
Test Items for Black, Hispanic, and Anglo Control
and Training Groups

Hispanic Students
Control

Percent Standard s t 

Dc^ I at' I on Item 

-3.0\ RJ 
0.0\ *' R2 
v 0.0 \ . ,R3 

0.1414 „ R4 

0. 1414 R5 
0.239* ■ R6 
0.4785\ ' R7 

0.4314\ HB 

0.3703 \ R9; 

1 0.431.4 \ Rip * 

0.4903 \ R 111 

0.4986 R12 

1.9268 ASUB 

,0.1414- R1;3 
0. 1979 *' R\4 

0.4431 Rt5 

- 0.4712 R16 

0.5014 R17 

0.5051 R10 

0.4986 R19 

0.5014 R20 

0.5051 . R21 

0.4903 R22 

. 0.4986 R23 

0.4314 R24 , 

3.0932 BSUB 

0.3505 R25 ' 
^0.4185 \ R26 

0.4431 R27 

0.5014 R28 

0.4849 R29 

0.4949 * R30 rf 

0.5014 R31 

0.4949 ♦ R32 
0.5047 , R33 

0.3703;. R34 
0.3030 ■ « R35 

0.3505 R36 

2.8692 CSUB 

0.3703 'R37 

0.4949 R38 * 

0.5035 R39 

' 0.5014 R40 

0r4431 R41 

0.5051 R42 

0.4.949 R43 

4 0.5035 R44 

0.4431 . R45 

0.4041 R46 

0.3283 . R47 

0.2399 ■ R48 

2.9969 DSUB 

0.5051 R49 

0.4785 R50 

0.4536 R51 

0.3661 R52 

0.4183 R53 

0.3505 R54 

0.3030 R55 

0.3283 R56 

0.1979 R57 

0.2399 R58 

0.2399 t, . R59 

0.27*0 R60 

2".0368 ESUB 

10.5016 Raven 
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Training 
Percent*' 
Pass I ng 
1.0000 
1.0000 
-0.9394 
"0.9394 
0.9697 • 
0.9697' 
0.6'9t0 
0.6364 
0.9394 
0.878# 
0.6061 
0.3939 
9.9697 
1.0000 
1.0000 

0.8182 1 
0.6970 
0.7576 - 
0.545l5 ' 
0.4545 
0.5455 
0.5758 
0.6667 
0.4846 
0.3030 
7.8485 
0.8788 
0.J 2 73 
0.6970 
0.5455 
0.6970 
0.4848 
0.7879 
0.5152 
0.7576 
0.3030 
0.2727 
0.0606 
6.727,3 
0.9394 
0.7273 
0.7576 
0.51 52 
0.7576 
0.5758 
0.6061 
0.4545 
0.2727 
0.4848 
0.1515 
0.0606 

• 6.3030 
0.5455 * 

* 0.606 I 
0.6061 
D.4545 
0.4242 
0.2727 
0.0909 
0.1618 
0.09Q9 
0.0606 
0.0303 
0.0606 
3.4242 

34.2424 

(N-33) 



Standard 
Dev I at I on 
0.0 
0.0 

0.2423 

0.2423 

0.1741 

0. 1 74 

0.4667 

0.4885 • 

0.2423 

0.3314 

0.4962 

0.4962 

2.2289- 

0.0 

0.3917 

0.4667 

C.4352 

0.5056 

0.5056 

0.5056< 

0.5019 

0.4787 

0.5075. 

0.4667 

3.4652 

0.3314 

0r4 52* 

0.4667 
0.5056 
0.4667 
0.5075 
0.4151 
0.5075 „ 
0.4352 
0.4667 
0.4523 
0.2423 
2.8093 
0.2423 
0.4523 
0.43^2 
0.5075 
0.4352 
0.5019 
0.4962 
0.5056 
0.4523 
0.5075 
0.3641 
0.2423 
3.1373 
0.5056 
0.4962 - 
0.4962 
0.5056 
0.5019 
0.4523 
0.2919 
0*39 17 
0.2919 
0.2423 
0.1741 
0.2423 
2.6579 
12.1399 
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TABLE 9 (Continued) 
Item Difficulties (Percent Passing) on Raven
Test Items for Black, Hispanic, and Anglo Control
and Training Groups

Anglo Students

Control

Percent Standard 6 * 

Item Passing Deviation Item 

R1 1.0000. 0.0 R1 

R2 WOO00 0.Q R2 

R3 . d.9750 0. 1581 R3 

R4 0U9750 0.1581 R4 

R5 * y.9500 0.2207 R5 

R6 ^.9 750 fcl581 R6 

R7 JO. 8750 0.3349 R7 

#® ^/ 0.8 750 0.3349 R8 

R9 0.9000 0.3038 R9 

R10 0.8000 . 0.4051 R10 

R11 0.6500 0.4830 ° R11 

R12 '0.4750 0.5057 R12 

ASUB 10.6250 1.5801 ASUB 

R13 1.0000 0.0 © R13 

R14 V 0.9500 0.2207 ft R14 

R15 0.9500 0.2207 R15 

R16 0.8750 0.3349 R16 

B,17> 0.8250 0.3848 R17 

R18 0.7750 0.4229 R18 

R19 0.7500 0.4365 R19 

R20 0.5250 0.5057 R20 

R21 0.6500 0.4830 R21 

R22 * 0.7000 0.4641 R22 

R23 0.6750 0.4743 R23 

R24 0.4250 0.5006 R24 

"BSUB 9.0750 2.7955 BSUB 

R25 0.9000 0.3038 R25 

R26 0.9250 0.2667 R26 

R27 0.8000 0.4051 R27 

R28 < 0.8000 0.4051 R28 

R29 0.8000 « 0.4051 R29 

R30 0.6750 0.4743 R30 

R31 0.6500 0.4830 R31 

R32 '0.6000 0.4961 R32 

R33 0.6500 0.4830 R33 

R34 0.2750 0.4522 , R34 

R35 0. 1500 0.3616 % M R35 

R36 0.0750 0.2667 " R36 

CSUB .7.2750 2.6697 CSUB 

R37 0.9500 0.2207 R37 

\f*38 * 0.8750 0.3349 *< • R38 

ft39 0.8000 0.4051 R39 

R40 0.8500 0.3616 R40 

R\41 0.7750 0.4229 R41 

R42 0.7750 0.4229 R42 

R43 * 0.6500 0.4830 R43 

R44 0.5750 0.5006 R44 

R45 ■. 0.4750 0.5057 R45 

R46 0.5500 0.5058 . R46 

R47 0.1 500 0.3616 R47 

R48 .. 0.0 0.0 R48 

DSUB 7.4250 2.5809 DSUB 

R49 0.6000 0.4961 R49 

R50 0 1 .5250 0.5057 R50 

R51 0.2250 0.4229 R51 

R52 0.4Q00 0.4961 R52 

R53 0.2250 0.4229 R53 

R54 v 0.2250 0.4229 R54 

R55 - 0.2750 0.4522 R55 

R56 0.1500 0.3616 R56 

ft57 0.0 0.0 R57 

R58 0.1000 0.3038 R58 

R59 0.0500 0.2207 ,R59 

R60 0.0500 0.2207 R60 

JESUB V 3.0250 2, 1302 ESUB 

Raven 37.3750 9.2662 Raven 

(N-40) 



Tra I nl 

Percent 

Pass I ng 
1.0000 
1.0000 
, 1.0000 

1 .0000 

1 .0000 

1 .0000 

0.8529 

0.9412 

1 .0000 

0.8235 

0.6765 

0.4706 
10.7647 

0.9706 

0.9706 

0.8824 

0.7941 

0.6765 

0.6176 

0.5882 

0.7941 

0 fi 7647 

0.7941 

0.7059 

0.4412 

9.0000 

0.9706 

0.8529 

0.9706 

0.7941 

0.8824 

0.7941 

0.9118 

0.7353 

0.9118 

0.5294 

0.4706 

0.0882 

8.9118 

0.9412 

0.9706 

0.9412 

0.8529 

1 .0000 

0.8529 

0.6471, 

0.7059 
, 0.4118 

0.7941 

0.2353 

0.0588 

8.4412 

0.5588 

0.6471 

0.5588 

0.4 706 

0.5000 

0.3624 

0.3529 
• 0.2647 

0.2059 

0.2353 

0.0 

0.0882 
4.2647 
41 .3529 

. (N« 



na 

Standard 

Dev I at Ion 

0.0 

0.0 

0.0 

0.0 

0.0 

0.0 

0.3595 
0.2388 
0.0 

0.3870 

0.4749 

0.A5066 

1 .\1 297 

0.1715 

0. 1715 

0.3270 

0.4104 

0.4749 

0.4933 

0.4.996 

0.4104 

0.4306 

0.4104 

0.4625 

0.5040 

3.0050 

0.1715 

0.3595 

0.1715 

0.4104 

0.3270 

0.4104 

0.2879 

0.4478 

0.2879 . 

0.5066 

0.5066 

0.2879 

1 .9285 

0.2388 

0.1715 

0.2388 

0.3595 

0.0 

0.3595 

0.4851 

0.4625 m 

0.4996 

0.4104 

0.4306 

0.2388 

1 .7441 

0.5040 

0.4851 

0.5040 

0.5066 

0.5075 

0.4933 

0.4851 

0.4478 

0.4104 o 

0.4306 

0.0 

0.2879 
2.6089 
6.5592 

•34) 



of the Raven for culture-loading. First, Raven total test performance is
examined. Second, analysis is made of Raven subscale performance
according to the subscales defined in the Raven test manual. Finally,
analysis is made of theoretically constructed subscales grouped according
to item M-demands.

Analysis of these data was performed using the cell-means model with
planned comparisons described by Marascuilo and Levin (1976). As noted
above, the cell-means model was selected because it allows for analysis of
both the research questions concerning within-group training differences
(nested analysis) and a comparison of the gains between groups (interaction
analysis) without confounding the Type I error rate. The analysis is
essentially a one-way analysis of variance of the six groups in the study.
One-tail planned contrasts are then computed for selected comparisons with
the total alpha associated with the full factorial design (i.e., treatment
by group: 2 x 3) distributed across the planned comparisons.

The alpha level for a full factorial analysis is .15. This represents
.05 for each of the two main effects of treatment and ethnic group, and .05
for their interaction. For the analysis five contrasts are made. These
include three within-group training effects contrasts (i.e., nested
analysis of training vs. control within race) and two interaction contrasts
comparing training effects for Black and Hispanic groups with the Anglo
groups. The within-group contrasts were computed with an alpha level of
.10, using critical t-values such that each contrast is tested at alpha
equal to .10/3 = .033. The two interaction contrasts were computed with
alpha controlled at .05 (the level allowed in a fully crossed analysis) so
that each comparison is made at .025. The total experiment-wide alpha is
.15, in agreement with a fully crossed two-way analysis of variance. The
standard error term is obtained from the one-way analysis.
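
The allocation of the Type I error rate described above amounts to dividing each family's alpha evenly among its contrasts. The short sketch below only restates that arithmetic; the function and constant names are hypothetical, and an even split is assumed because the text reports .10/3 and .05/2.

```python
def per_contrast_alpha(family_alpha, n_contrasts):
    """Evenly divide a family-wise alpha among its planned contrasts."""
    if n_contrasts < 1:
        raise ValueError("at least one contrast is required")
    return family_alpha / n_contrasts


# Three within-group (training vs. control) contrasts share alpha = .10.
WITHIN_GROUP = per_contrast_alpha(0.10, 3)
# Two interaction contrasts share alpha = .05.
INTERACTION = per_contrast_alpha(0.05, 2)
# Total experiment-wide alpha across both families.
EXPERIMENTWISE = 0.10 + 0.05

if __name__ == "__main__":
    print(f"within-group: {WITHIN_GROUP:.3f}")
    print(f"interaction:  {INTERACTION:.3f}")
    print(f"total:        {EXPERIMENTWISE:.2f}")
```

The same division applied to the earlier M-level analysis, per_contrast_alpha(0.15, 9), gives the .0167 level quoted for the nine Kruskal-Wallis contrasts.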

In the following section the results for the total Raven test are
presented first, followed by an analysis of Raven subscales and then the
theoretically constructed subscales.

Effect of Training on Raven

Average test performance on the Raven total and subscales for each
group is given in Table 10. The results of the cell-means analysis of the
Raven total score are reported in Table 11.

The results shown in Table 11 indicate a significant difference in
favor of the treatment groups for Black and Hispanic students, but not for
Anglo students. At the same time, however, the significant training effect
for minority students was not significantly greater than the difference in
the Anglo group (i.e., no interaction effect).

These results do not support the hypothesis that training would be
effective for all groups, nor do they support the hypothesis that the
training effect would be greater for minority students than for Anglo
students. On the other hand, partial support is obtained indirectly in that
a significant training effect occurred for minority students but not for
Anglo students. Stated in this way, training was effective for minority
students but not for Anglo students on the Raven test overall.
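
The standard errors and one-tail lower limits behind these conclusions can be recomputed from quantities the report gives: the within-cell mean square (MSW = 94.4624, Table 11 footnote), the cell sizes (Table 10), the training-minus-control differences, and the critical t-values (2.15 and 1.96). The sketch below assumes the usual cell-means formulas, SE = sqrt(MSW(1/n1 + 1/n2)) for a pairwise contrast and the root of the summed squared SEs for an interaction contrast; the function names are hypothetical.

```python
from math import sqrt

MSW = 94.4624  # within-cell mean square, from the Table 11 footnote
N = {"Black": (58, 63), "Hispanic": (29, 44), "Anglo": (31, 38)}  # (training, control)
GAIN = {"Black": 6.67, "Hispanic": 5.03, "Anglo": 3.85}  # training minus control
T_WITHIN, T_INTERACTION = 2.15, 1.96  # critical values quoted in Table 11


def se_within(group):
    """Standard error of a training-vs-control contrast within one group."""
    n_train, n_control = N[group]
    return sqrt(MSW * (1 / n_train + 1 / n_control))


def se_interaction(group, reference="Anglo"):
    """Standard error of (gain in group) minus (gain in reference group)."""
    return sqrt(se_within(group) ** 2 + se_within(reference) ** 2)


def lower_limit(estimate, se, t_crit):
    """One-tail lower confidence limit; the contrast is significant if > 0."""
    return estimate - t_crit * se


if __name__ == "__main__":
    for g in N:
        se = se_within(g)
        print(f"{g:9s} gain={GAIN[g]:5.2f} SE={se:.4f} "
              f"LL={lower_limit(GAIN[g], se, T_WITHIN):8.4f}")
    for g in ("Black", "Hispanic"):
        diff = GAIN[g] - GAIN["Anglo"]
        se = se_interaction(g)
        print(f"{g:9s} minus Anglo: diff={diff:5.2f} SE={se:.4f} "
              f"LL={lower_limit(diff, se, T_INTERACTION):8.4f}")
```

Under these assumptions the lower limits are positive for the Black and Hispanic within-group contrasts and negative for the Anglo contrast and both interaction contrasts, which is the pattern described in the text.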

The same hypotheses were tested for the Raven subscales. The results
of the analysis are shown in Table 12.

Table 10

Average Score and Standard Deviation of Black, Hispanic and Anglo
Students with an M-level of at Least Zero on the Raven Total and Subscales

                         Black                Hispanic                Anglo
                   Training  Control     Training  Control     Training  Control
Subscale A  Mean     10.2      9.2         10.5     10.2         10.8     10.7
            SD       1.50     2.03         1.60     1.83         1.13     1.43
Subscale B  Mean      8.7      7.1          8.4      7.4          9.2      9.4
            SD       2.88     3.44         3.18     3.09         2.86     2.52
Subscale C  Mean      6.9      5.5          7.0      6.1          9.1      7.6
            SD       2.78     2.99         2.60     2.83         1.78     2.37
Subscale D  Mean      6.7      5.8          6.7      5.5          8.7      7.7
            SD       2.89     3.16         3.10     2.86         1.29     2.21
Subscale E  Mean      3.0      1.6          3.7      2.2          4.4      3.1
            SD       2.48     1.46         2.70     2.12         2.68     2.15
Raven       Mean     35.7     28.9         36.3     31.3         42.3     38.5
            SD      10.34    10.39        11.15    10.46         5.71    10.59
N                     58       63           29       44           31       38

Table 11

Planned Contrasts(a) for Selected Group
Comparisons on Raven Total Score

                                                         One-tail
                                                    Confidence Interval(b)
Comparison                             ψ̂        SE       Lower Limit
ψ1: Black Training vs. Control        6.67    1.7686        2.86
ψ2: Hispanic Training vs. Control     5.03    2.3247         .0319
ψ3: Anglo Training vs. Control        3.85    2.3522       -1.2072
ψ4: ψ1 - ψ3                           2.82    2.9430       -2.9483
ψ5: ψ2 - ψ3                           1.18    3.3071       -5.3019

a Cell-means analysis (Marascuilo & Levin, 1976): ψ = ψ̂ - t(SE);
  MSW = 94.4624
b One-tail t critical value for ψ1, ψ2, and ψ3 equals 2.15; total
  alpha = .10, alpha per contrast = .033. One-tail t critical value for
  ψ4 and ψ5 equals 1.96; total alpha = .05, alpha per contrast = .025.
  Values are statistically significant if greater than zero.

Table 12

Planned Contrasts for Selected Group
Comparisons on Raven Subscales

                                                         One-tail
                                                    Confidence Interval(a)
Comparison                             ψ̂        SE       Lower Limit
Subscale A
ψ1: Black Train. vs. Control          1.02     .3032        .3681*
ψ2: Hispanic Train. vs. Control        .30     .3985       -.5568
ψ3: Anglo Train. vs. Control           .10     .4032       -.7670
ψ4: ψ1 - ψ3                            .92     .5045       -.0688
ψ5: ψ2 - ψ3                            .20     .5669       -.9112

Subscale B
ψ1: Black Train. vs. Control          1.59     .5538        .3992*
ψ2: Hispanic Train. vs. Control       1.0      .7280       -.5651
ψ3: Anglo Train. vs. Control          -.14     .7366      -1.7237
ψ4: ψ1 - ψ3                           1.73     .9216       -.0763
ψ5: ψ2 - ψ3                           1.14    1.0356       -.8960

Subscale C
ψ1: Black Train. vs. Control          1.42     .4859        .3753*
ψ2: Hispanic Train. vs. Control        .90     .6387       -.4732
ψ3: Anglo Train. vs. Control          1.55     .6463        .1605*
ψ4: ψ1 - ψ3                           -.13     .8086      -1.7149
ψ5: ψ2 - ψ3                           -.65     .9087      -2.4309

Subscale D
ψ1: Black Train. vs. Control           .98     .5006       -.0962
ψ2: Hispanic Train. vs. Control       1.12     .6580       -.2946
ψ3: Anglo Train. vs. Control          1.00     .6658       -.4314
ψ4: ψ1 - ψ3                           -.02     .8329      -1.6526
ψ5: ψ2 - ψ3                            .12     .9360      -1.7146

Subscale E
ψ1: Black Train. vs. Control          1.42     .4040        .5515*
ψ2: Hispanic Train. vs. Control       1.51     .5310        .3684*
ψ3: Anglo Train. vs. Control          1.31     .5373        .1549*
ψ4: ψ1 - ψ3                            .11     .6722      -1.2075
ψ5: ψ2 - ψ3                            .20     .7554      -1.2805

a See footnotes, Table 11.
* Statistically significant at alpha = .033

Significant training effects occurred for the Black training group on
all subscales with the exception of subscale D. Hispanics showed a
significant difference in test performance on subscale E, while the Anglo
training group scored significantly higher than their control group on
subscales C and E. None of the interaction contrasts were significant.

The important thing to note is the pattern of treatment effects. It
was hypothesized that training would be effective on the more difficult
test items. In the case of the Raven subscales, which increase in average
difficulty and in the types of items comprising each subscale, it was
expected that the latter subscales would be affected. A significant effect
occurred for all groups on the most difficult subscale (E). For Hispanics,
this was the only subscale on which a significant effect occurred. Of
interest, too, is that no effect occurred on subscale D for any of the
groups.

Effect of Training on Raven Items Grouped According to M-demand 

In order to examine the effects of training according to the item's
complexity (i.e., information processing demands), the items were grouped
according to M-demands as identified by Bereiter and Scardamalia (1979).
According to Bereiter and Scardamalia, Raven test items can be grouped
according to their processing requirements. Table 13 gives the M-demand
analysis, which is taken from their Table 4 (Bereiter & Scardamalia, 1979,
p. 60).

The means and standard deviations of the theoretically constructed
M-demand subscales are presented in Table 14.

The analyses for each of the theoretically constructed Raven subscales
are shown in Table 15.

The training effect was significant for Black students on all
subscales except the most complex subscale, with an information processing
demand (M-demand) of six. Hispanic students showed a significant training
effect on items grouped with an M-demand of five. Anglo students showed



Table 13 

Grouping of Raven Items According to M-Demand*

                    M-Demand
     1      2      3      4      5      6
     1      7     19     11     35     57
     2      8     21
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Table 14 



Average Score(a) and Standard Deviation of Raven Total and of Items
Grouped According to M-Demand of Subjects with M-level of Zero or Greater

                       Black                 Hispanic                  Anglo
M-Demand         Training  Control     Training   Control      Training    Control
ONE     Mean       .96       .91         .97    [illegible]  [illegible] [illegible]
        SD         .08       .15         .05       .11          .05         .06
TWO     Mean       .76       .65         .77       .70       [illegible] [illegible]
        SD         .24       .27         .25       .27          .12         .20
THREE   Mean       .67       .50         .63       .53          .79         .71
        SD         .29       .30         .31       .30          .17         .23
FOUR    Mean       .48       .31         .49       .39          .62         .53
        SD         .27       .24         .30       .26          .17         .26
FIVE    Mean       .19       .12         .27       .15          .35         .19
        SD         .20       .13         .19       .18          .24         .14
SIX     Mean       .08       .06         .07       .05          .15         .05
        SD         .14       .11         .16       .12          .20         .10

a Percent passing



Table 15

Planned Contrasts for Selected Group
Comparisons on Raven Items Grouped According to M-Demand

                                                            One-Tail
                                                      Confidence Interval(a)
Comparison                              ψ̂         SE        Lower Limit

M-Demand = 1
  ψ1: Black Train. vs. Control         .0486     .0177        .0105*
  ψ2: Hispanic Train. vs. Control      .0338     .0233       -.0163
  ψ3: Anglo Train. vs. Control         .0069     .0236       -.0438
  ψ4 = ψ1 - ψ3                         .0417     .0295       -.0162
  ψ5 = ψ2 - ψ3                         .0269     .0332       -.0063

M-Demand = 2
  ψ1: Black Train. vs. Control         .1091     .0433        .0160*
  ψ2: Hispanic Train. vs. Control      .0704     .0569       -.0519
  ψ3: Anglo Train. vs. Control         .0193     .0576       -.1043
  ψ4 = ψ1 - ψ3                         .0896     .0720       -.0516
  ψ5 = ψ2 - ψ3                         .0309     .0810       -.1078

M-Demand = 3
  ψ1: Black Train. vs. Control         .1670     .0505        .0584*
  ψ2: Hispanic Train. vs. Control      .1032     .0664       -.0396
  ψ3: Anglo Train. vs. Control         .0778     .0671       -.0665
  ψ4 = ψ1 - ψ3                         .0892     .0840       -.0755
  ψ5 = ψ2 - ψ3                         .0234     .0944       -.1597

M-Demand = 4
  ψ1: Black Train. vs. Control         .1696     .0461        .0705
  ψ2: Hispanic Train. vs. Control      .1067     .0606       -.0235
  ψ3: Anglo Train. vs. Control         .0936     .0613       -.0382
  ψ4 = ψ1 - ψ3                         .0760     .0767       -.0744
  ψ5 = ψ2 - ψ3                         .0131     .0862       -.1559

M-Demand = 5
  ψ1: Black Train. vs. Control         .0756     .0327        .0053*
  ψ2: Hispanic Train. vs. Control      .1179     .0429        .0257*
  ψ3: Anglo Train. vs. Control         .1627     .0434        .0694*
  ψ4 = ψ1 - ψ3                        -.0871     .0543       -.1935
  ψ5 = ψ2 - ψ3                        -.0448     .0611       -.1645

M-Demand = 6
  ψ1: Black Train. vs. Control         .0220     .0253       -.0324
  ψ2: Hispanic Train. vs. Control      .0235     .0333       -.0481
  ψ3: Anglo Train. vs. Control         .0925     .0337        .0200*
  ψ4 = ψ1 - ψ3                        -.0705     .0421       -.1530
  ψ5 = ψ2 - ψ3                        -.0690     .0473       -.1617

(a) See footnotes, Table 11.
 *  Significant at alpha = .033



significant training effects on items with an M-demand of five and six.
Again, none of the interactions was statistically significant.

The results indicate that training was most effective for Black
students. The fact that Raven items with an M-demand of six were not
affected for Blacks and Hispanics is not surprising, since there were no
subjects in either training group with an M-level of six.

The fact that none of the interactions was significant indicates that 
in those cases where training resulted in higher performance, the result 
was not greater for the minority students than for the Anglo students. 

In general, the hypothesis that training would result in significant
improvement was supported for the Black and Hispanic students on the total
Raven. On Raven items grouped according to M-demand, only Black students
showed consistent improvement. The hypothesis that training would lead to
improved performance on the Raven did not hold for Anglo students.

The hypothesis that the gains for minorities would be significantly
greater than for Anglo students was not supported at all. Minority
students showed significant differences over their control groups while
the Anglo students did not, which would lend some support to the statement
that training was more effective for minority subjects. Overall, however,
this simply was not verified on the basis of the statistical interpretation
of the statement applied in the analysis.

In summary, training was most effective for Black students on all of 
the analyses reported. Training was effective for Hispanics on the most 
difficult subscale of the Raven as well as total score. 

Anglo training/control group differences were not significant on the
Raven total and, with this one exception, generally followed the pattern
reported for Hispanics. Again, although the hypothesis of the effectiveness
of training was supported for minority subjects, the hypothesis of greater
effect for minority students was not supported. On the other hand,
training was often effective for minority subjects, specifically Blacks,
while it was not for Anglo students.

Finally, if one considers that the average M-level was generally lower
for minority subjects, and that the highest M-level for minority control
subjects was five, then training was effective on the items that are within
their so-called processing capacity. The effect of training according to
processing level is reported in the next section.

Effects of Training According to Processing Capacity

In this section the results of the effects of training according to
subjects' M-level are reported. The results are reported for the Raven
total and by theoretically constructed subscales (i.e., items grouped
according to processing demands). The data are reported in two ways.
First, training effect results by M-level are reported for all subjects.
These are then contrasted with training effects when M-level is
statistically controlled through analysis of covariance.

Raven Total by M-level

Results of Raven total test performance by M-level are given in Table
16. The first thing to note is the apparent monotonic relationship between
average scores and increases in M-level. Perhaps more interesting,
however, are the differences between groups within a given M-level. While
there are some differences in total score (given small N's), the
differences are not so great as to conclude that there are major
differences overall between minority and Anglo subjects. That is, the main
differences in the total scores across groups are due to differences in
M-level within groups. This is interesting because M-level is a
developmental variable, vis-a-vis Piaget, which presumably increases with
age. Since more minority subjects show lower M-levels, it is not
surprising that they score lower on the Raven test. In this case, one
could conclude that the observed differences on the Raven are
developmental, e.g., that minority subjects exhibit a lower or "retarded"
developmental rate (e.g., Jensen, 1974b).

Average Raven Score of Control and Training Subjects by M-Level

On the other hand, the results for the training groups, which also
show more minority subjects with lower M-levels, suggest that there is
little within-M-level difference in Raven performance after training. This
is particularly true at M-levels greater than one, and more so for Black
subjects. Thus, while group differences on the Raven test are related to
differences in M-level, it is not because of retarded development. A
purely "developmental lag" explanation does not explain these results,
since cognitive developmental growth generally takes longer than two weeks
and is usually unaffected by training. Since there is little difference in
Raven scores within M-levels for Training subjects, it is more likely that
minority subjects simply do not comprehend the task requirements of the
Raven test in the first place.

Group Comparison with Anglo Control Students

Analysis of the data in Table 16 follows from the hypothesis that
training would be more effective for minority subjects. Stated another
way, the usual observed differences between minority and Anglo students
would be reduced or eliminated by training.

The ANCOVA was computed through regression analysis by dummy coding
group membership and treating the Anglo control group as the reference
category. In this way comparisons between each group and the Anglo control
are output directly from the analysis. With this procedure there are five
comparisons, each tested at alpha equal to .02 for each contrast and .10
over all comparisons.
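
The dummy-coding step can be sketched as follows. This is a minimal illustration, not the authors' program: the function names and the two example subjects are hypothetical, and only the six group labels, the choice of Anglo Control as reference, and the .02 and .10 alpha levels come from the text.

```python
# Six study groups; Anglo Control is the reference category and so
# receives no indicator column of its own.
GROUPS = ["Black Training", "Hispanic Training", "Anglo Training",
          "Black Control", "Hispanic Control", "Anglo Control"]
REFERENCE = "Anglo Control"


def dummy_code(group, groups=GROUPS, reference=REFERENCE):
    """Return one 0/1 indicator per non-reference group for one subject."""
    return [1 if group == g else 0 for g in groups if g != reference]


def design_row(m_level, group):
    """Constant, covariate (M-level), then the five group indicators."""
    return [1, m_level] + dummy_code(group)


# Hypothetical subjects, for illustration only.
rows = [design_row(3, "Black Control"), design_row(4, "Anglo Control")]

# Five contrasts at .02 each add up to the stated overall level of .10.
overall_alpha = round(5 * 0.02, 2)
```

With this coding, each indicator's regression coefficient is that group's adjusted deviation from the Anglo Control group, which is why the five comparisons come directly out of a single regression.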

An alpha of .10 was selected since the complete factorial design 
(ethnic group by treatment) allows for an alpha of .15. However, since an 
assumption of ANCOVA is homogeneity of regression and because the hypothe- 
sis does not technically require a group-by-treatment analysis, the remain- 
ing .05 was allocated to a statistical test of the homogeneity assumption. 
The test for homogeneity of regression is reported first, followed by the 
results of the ANCOVA. 

Test for Homogeneity of Regression. The statistical test for
parallelism of regression lines was performed according to the procedures
described in Kerlinger and Pedhazur (1973). In their procedure, an F-test
is performed on the difference between the R² obtained for the regression
of Raven onto group membership and separate vectors representing M-level
for each group, and the R² for Raven regressed onto group membership and a
single vector representing M-level. The F formula is:



\[
F = \frac{(R^2_1 - R^2_2)/(k_1 - k_2)}{(1 - R^2_1)/(N - k_1 - 1)}
\]

where  R^2_1 = Raven onto group and M-level vectors for each group,
       R^2_2 = Raven onto group and M-level,
       k_1   = number of vectors for R^2_1 = 11,
       k_2   = number of vectors for R^2_2 = 6.

The computed R²'s are .384 for Raven onto group and separate M-level
vectors and .375 for Raven onto group and a single M-level vector. For
these values F = .687, which is not significant against the critical value
of 2.21 for F with 5 numerator degrees of freedom. Thus, the assumption of
parallelism is supported.
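
As a worked check on the formula, the sketch below computes F from two R² values. The function and argument names are hypothetical. N = 263 is taken from the footnote to Table 18 and is assumed here to apply to the Raven total as well. Because the R² values are printed to three decimals, the result only approximates the reported F.

```python
def homogeneity_f(r2_separate, r2_single, n, k_separate, k_single):
    """F-test on the gain in R-squared from fitting separate M-level
    vectors per group instead of a single common M-level vector."""
    numerator = (r2_separate - r2_single) / (k_separate - k_single)
    denominator = (1.0 - r2_separate) / (n - k_separate - 1)
    return numerator / denominator


# Raven total, using the rounded R-squared values given in the text.
f_total = homogeneity_f(0.384, 0.375, n=263, k_separate=11, k_single=6)

# M-Demand = 1 subscale, using the five-decimal values in Table 18.
f_m1 = homogeneity_f(0.15200, 0.13524, n=263, k_separate=11, k_single=6)
```

With the rounded inputs `f_total` comes out near 0.73, close to but not identical with the reported .687, while `f_m1` rounds to the 0.99 printed in Table 18.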



ANCOVA Results. The results for the ANCOVA controlling for M-level
across all groups are summarized in Table 17. The results show that when
each group is compared with the Anglo Control group (as criterion),
significant differences occur between minority Control and Anglo Control
group students, whereas no differences occur between minority Training
group students and Anglo Control group students, nor do differences occur
between the Anglo Training and Control groups. Clearly, training had the
effect of eliminating the initial differences between minority and Anglo
students.

Group Comparisons on Theoretically Defined Raven Subscales

Homogeneity of Regression. The same analysis was performed for each
theoretically constructed Raven subscale. The statistical test for
parallelism for each subscale is reported in Table 18.

The R²'s in Table 18 represent regressions of each subscale onto
group membership (defined by the six groups described above) and a single
vector representing M-level, and regressions of the subscales onto group
membership and separate vectors representing M-level according to group
membership. The difference in R²'s is tested by an F-test and represents
the test for homogeneity of regressions among the groups (Kerlinger and
Pedhazur, 1973).

Results of the F-test necessitate rejection of the hypothesis of
parallel regression lines for the "Three-M" subscale and indicate that
analysis of covariance is inappropriate for it. The remaining five
subscales are not significant, so an ANCOVA was performed for these
subscales. Table 19 presents the ANCOVA results.

ANCOVA. The data in Table 19 represent significance tests between five
of the groups and the Anglo Control group. Beta (B) is the deviation of
each group (i.e., effect) from the mean score obtained on the given
subscale by the Anglo Control group subjects. The F-value is tested for
significance at alpha equal to .05 for the covariate (M-level) and .02 for
each of the group effects.

Table 17

Summary ANCOVA for Comparisons of Study Groups with Anglo Control
Group Subjects on Raven Test with M-Level as Covariate

Variable                         B          SE          F

M-level                        3.0141      .3314      82.74*
Black Training                 1.7512     1.8389        .91
Hispanic Training             -0.3134     2.0977        .02
Anglo Training                 3.9128     2.0489       3.65
Black Control                 -5.9611     1.7831      11.18**
Hispanic Control              -6.4305     1.8768      11.74**
Anglo Control (Constant)      27.1313

 * Significant at p < .05, F > 2.21
** Significant at p < .02, F > 5.38

Table 18

Homogeneity Test for the Regression of Raven M-Demand
Subscales onto M-Level in Black, Hispanic and Anglo
Training and Control Groups

Raven               R²                    R²
Subscale       (M-Level/Group)   (Group/[Group x M-Level])     F(a)

M-Demand = 1       .13524               .15200                 0.99
M-Demand = 2       .26229               .28767                 1.79
M-Demand = 3       .24867               .28811                 2.78*
M-Demand = 4       .36711               .37484                 0.62
M-Demand = 5       .26594               .28765                 1.68
M-Demand = 6       .07766               .10280                 1.21

(a) F = [(R²₁ - R²₂)/(11 - 6)] / [(1 - R²₁)/(263 - 11 - 1)]
    (Kerlinger and Pedhazur, 1973)

 * Significant at p < .05, F > 2.21

Table 19

Summary ANCOVA of Study Groups with
Anglo Control Group on Raven M-Demand Subscales: M-Level as Covariate

Subscale
M-Demand(a)   Variable                 B          SE          F

M-1           M-level                .1714      .0407      17.78*
              Black Training         .0351      .2257       0.02
              Hispanic Training      .0336      .2573       0.02
              Anglo Training         .0792      .2513       0.10
              Black Control         -.5589      .2187       6.53**
              Hispanic Control      -.3999      .2302       3.02

M-2           M-level                .8463      .1194      50.26*
              Black Training        -.2701      .6625       0.17
              Hispanic Training     -.9228      .7557       1.49
              Anglo Training         .2913      .7381       0.15
              Black Control        -2.0909      .6424      10.59**
              Hispanic Control     -2.2151      .6761      10.73**

M-4           M-level                .8161      .0853      91.52*
              Black Training         .7424      .4735       2.46
              Hispanic Training      .1518      .5401       0.08
              Anglo Training         .9536      .5275       3.27
              Black Control        -1.2375      .4591       7.27**
              Hispanic Control     -1.2111      .4832       6.28**

M-5           M-level                .4219      .0653      41.79*
              Black Training         .6593      .3622       3.31
              Hispanic Training     1.0667      .4132       6.67**
              Anglo Training        1.6363      .4036      16.44**
              Black Control        -0.2437      .3512       0.48
              Hispanic Control     -0.2647      .3697       0.51

M-6           M-level                .0650      .0214       9.19*
              Black Training         .1998      .1189       2.82
              Hispanic Training      .1059      .1357       0.61
              Anglo Training         .3715      .1325       7.86**
              Black Control          .0891      .1153       0.60
              Hispanic Control      -.0116      .1214       0.01

(a) The M-Demand = 3 subscale did not meet the homogeneity assumption
    so ANCOVA was not performed for this subscale.

 * Significant at p < .05, F > 2.21
** Significant at p < .02, F > 5.38
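
The printed F-values in Table 17 are consistent with the squared ratio of each coefficient to its standard error. The sketch below illustrates that relationship using a few Table 17 entries. It is an observation about the printed numbers, not a description of the authors' software, and the function names are hypothetical.

```python
def f_from_coefficient(b, se):
    """F with one numerator degree of freedom, taken as (B / SE) squared."""
    return (b / se) ** 2


def group_effect_significant(f_value, critical=5.38):
    """Apply the .02-level critical value printed under Table 17."""
    return f_value > critical


# (B, SE) pairs copied from Table 17.
table_17 = {
    "Black Training": (1.7512, 1.8389),
    "Anglo Training": (3.9128, 2.0489),
    "Black Control": (-5.9611, 1.7831),
    "Hispanic Control": (-6.4305, 1.8768),
}

f_values = {name: f_from_coefficient(b, se)
            for name, (b, se) in table_17.items()}
flags = {name: group_effect_significant(f) for name, f in f_values.items()}
```

Under this reading only the two minority Control groups exceed the critical value, matching the asterisks in the table.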

The covariate, M-level, was significant for all subscales. Black
control group students scored significantly differently from Anglo control
students on subscales with M-demands of one, two and four, but not on the
Five-M and Six-M demand subscales. Hispanic training group subjects showed
a significant difference on the Five-M subscale only. Anglo training group
students scored significantly differently on the Five-M and Six-M
subscales.

The direction of the differences was such that minority control groups
always scored lower than Anglo control students while training students in
all ethnic groups scored higher. These results indicate that when M-level
is controlled, differences between minority and Anglo control students
occur primarily on the less complex subscales. No differences occur on
the most complex subscales except in one instance where Hispanic students
scored higher. In all other instances, training group subjects scored
higher, although the differences are not significant. This finding is in
contrast to reports that minority subjects perform lower on more complex
tasks when compared to Anglo students (Jensen, 1974a, 1980). When
developmental level is controlled, there are no differences in
performance. Moreover, given training, it is apparent by observation of
mean scores in Table 16 that there are no differences between minority and
Anglo students.

Overall, the results indicate that the main difference in performance
on the Raven is due to differences in performance on the developmental
measure. This might lead one to conclude that the difference is
"developmental lag." However, observation of the mean scores within a
given M-level together with the effect of training suggests that this is
not the case. The most appealing argument is that there is a general "test
taking skill" reflected both in the M-level measure and in the performance
of Control group subjects. Training provided the test taking experience
necessary to produce group parity in performance.



CHAPTER IV
RESULTS: CULTURE-LOADING

Examination of the Culture-Loaded Hypothesis

Results of the culture-loading hypothesis are presented in three
sections. In the first section the procedure for identifying culture-loaded
items and its assumptions are described. The results of applying the
procedure are then presented. Since different items were identified as
culture-loaded, results are reported separately for Blacks and Hispanics.
The second section examines the validity of the procedure by comparing
group performance against outcomes considered to be consistent with a
culture-loaded interpretation. In this section expected outcomes are first
described, then an analysis of group performance is presented at the
individual item level and for items grouped according to processing
demands. Results are reported separately for Blacks and Hispanics. The
final section is a discussion of the results of the study.

Identification of Culture-Loaded Test Items 

Procedure 

Items are identified as culture-loaded when there are differences in
the information processing demands of the item for minority control
subjects in comparison to Anglo control group subjects. Whether the
information processing demand of an item is different for minority
subjects is determined by examining the percentage passing at the M-level
equal to the M-demand of the item. If there is a difference of 25% or more
between minority and Anglo subjects of the same M-level, then the item is
determined to require a greater processing capacity for minority subjects
and hence is identified as culture-loaded.

In a few instances minority processing demands were nearly equal to
those of Anglo subjects, but differences equal to or greater than 25%
occurred at higher M-levels. In these cases items were also judged to
require a greater processing demand, i.e., were culture-loaded. This was
done separately for Black and Hispanic students.
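
The identification rule can be sketched as below. The function name, the data layout (a mapping from M-level to percent passing among control subjects), and the example percentages are all hypothetical. The direction of the difference (Anglo minus minority) is an assumption drawn from the statement that a flagged item requires greater processing capacity for minority subjects.

```python
def is_culture_loaded(pct_minority, pct_anglo, m_demand, threshold=25.0):
    """Flag an item when Anglo control subjects pass it at least
    `threshold` percentage points more often than minority control
    subjects of the same M-level, checked at the item's M-demand and,
    per the second rule in the text, at higher M-levels."""
    levels = [m for m in pct_minority if m in pct_anglo and m >= m_demand]
    return any(pct_anglo[m] - pct_minority[m] >= threshold for m in levels)


# Hypothetical percent-passing figures for one item with M-demand 2.
gap_at_demand = is_culture_loaded({2: 40.0, 3: 70.0}, {2: 70.0, 3: 80.0}, 2)
gap_above_demand = is_culture_loaded({2: 60.0, 3: 50.0}, {2: 70.0, 3: 80.0}, 2)
no_gap = is_culture_loaded({2: 60.0, 3: 70.0}, {2: 70.0, 3: 80.0}, 2)
```

M-levels below the item's M-demand are ignored, since subjects at those levels are not expected to pass the item in any group.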

The procedure is consistent with the theoretical discussion provided
in previous chapters, namely that culture-loading affects the processing
demands of a task, and that differences in processing requirements can
occur because of differences in processing strategies, or because of
oversensitivity to misleading cues. The difference, however, is
hypothesized to be due to experience rather than to ability per se. In
either case, the processing requirements of the task are affected.

Total percent passing for each group is ignored in this process in
favor of group differences within M-levels. Thus, culture-loaded items may
or may not show ethnic group differences. Total percent passing is not
important for two reasons. First, it assumes a priori that no group
differences should exist. Second, differences are expected since the
number of subjects at each M-level is generally not the same across
groups. That is, group differences in percent passing are not meaningful
from a developmental perspective since the groups were not matched on
developmental variables.



In contrast, the assumption of no group differences within M-levels in
order to identify items as culture-loaded is based on the theoretical
position that subjects of the same M-level are developmentally equal.
Thus, the argument is that one can assume that no group differences should
exist. The reasonableness of this assumption, of course, rests with the
outcome of the analysis reported in this part of the study.

A further assumption in the procedure for identifying culture-loaded
test items is that the FIT itself is unbiased. While this assumption may
be questioned, it is clear that the finding of group differences alone is
not sufficient. Moreover, if the FIT is biased, the most likely outcome
would be that minority subjects' M-level is really higher than indicated.
This means that Anglo subjects are being compared with minority subjects
who can process more information. However, this would simply lead to fewer
items being identified as culture-loaded and would serve to provide a more
conservative test of the hypothesis.

Finally, since there are no, or very few, minority training group
subjects with an M-level of six, items with an M-demand of six could not be
included as part of the examination of culture-loading.

Research Questions

The research issues examined in this chapter are:

1. Raven test items identified, a priori, as culture-loaded will show
greater effect due to training than items not identified as
culture-loaded. That is, training will have its greatest effect
on the items identified as culture-loaded.

2. Greater group differences will be found on culture-loaded than on
non-culture loaded test items.

3. The training effect for minority students on culture-loaded items
will be greater than the training effect for Anglo students.

4. Training will reduce or eliminate group differences on culture-
loaded test items; the effect will be greater than for non-culture
loaded test items.



Results

Application of the criteria described above resulted in 26 items
identified as culture-loaded for Blacks and Hispanics. In most instances
the same items were identified as culture-loaded. There were, however, a
few deviations from this pattern. Culture-loaded items are shown, according
to M-demand, in Table 20. The letter next to the item indicates the
original Raven subscale to which the item belongs.

The total number and proportion of items identified as culture-loaded
within each M-demand subscale for Black and Hispanic students are also
shown in Table 20. The highest percentage of culture-loaded items was
found in subscales that are within the processing capacity (M-level) of the
students. The subscale with an M-demand of one showed few culture-loaded
items and was probably influenced by a ceiling effect, in that the majority
of subjects in each ethnic group were able to solve the items.

Examination of the items according to original Raven subscales
indicates that items of average to above average item difficulty were more
often identified as culture-loaded. This observation, together with the
item's M-demand, suggests that floor and ceiling effects are a factor in
identifying culture-loaded items according to the criteria used. That is,
items within the processing and difficulty level of the subjects are more
likely to be identified as culture-loaded. This is not to say that very



Table 20

Culture-Loaded Raven Test Items
for Black and Hispanic Students by M-Demand

Since there were no minority training group subjects with an
M-level of six, items with an M-demand of six could not be included.



easy or very difficult items are not culture-loaded. Rather, they simply
do not provide a range in which the particular criteria had enough power to
detect culture-loading.

In all, this means that there are probably some items that were not
identified as culture-loaded. For this reason, and because of possible bias
in the FIT, it is probable that not all culture-loaded items were
identified.
Validation of the Culture-Loading Hypothesis

Outcomes Consistent with a Culture-Loaded Interpretation

Basically a culture-loaded hypothesis is one in which test performance
is in part a function of characteristics unique to the group but not
necessarily related to what the test is supposed to measure in the first
place. Once items are identified, outcomes can be specified which, if
verified, would support a culture-loaded interpretation and hence a
culture-loaded hypothesis. There are three basic research hypotheses of
interest in the evaluation of the validity of identifying culture-loaded
items on the basis of the proposed theoretical information processing
model. Positive findings would support the approach used in this study and
a culture-loaded interpretation.

One expected outcome is that greater observed group differences would
occur on items identified as culture-loaded than on non-culture loaded
items. A second expectation is that training would affect culture-loaded
items more than non-culture loaded items. Finally, it is expected that
training would be more effective for minority students than for Anglo
students on the culture-loaded items but not necessarily on the non-culture
loaded items.



Examination of the expected outcomes is made by comparing various
differences in performance between minority and Anglo groups, and Training
and Control groups, on culture-loaded and non-culture loaded items
individually, and on the subscale totals when the items are grouped
according to M-demand. Five group comparisons are related to the expected
outcomes: 1) between minority and Anglo control groups; 2) between
minority and Anglo training groups; 3) between training and control
subjects in the minority group; 4) between training and control subjects
in the Anglo group; and 5) the interaction of treatment by ethnicity.

The comparisons correspond to specific culture-loading expectations.
Comparisons one and two concern expectations about the effect of training
on group differences. It is expected that culture-loaded items would show
greater between ethnic group differences than non-culture loaded items.
Items which show such a trend would be consistent with the culture-loading
expectation.

Comparisons three and four concern the expectation that training would
have a greater effect on minority group differences (i.e., between training
and control groups) than on Anglo group differences. Thus, it would be
expected that training would be more effective for minorities on
culture-loaded items than on non-culture loaded items. The same outcome
would not necessarily be expected for Anglo comparisons.

Finally, comparison five concerns the expectation that training would
be more effective for minorities than for Anglo students on the
culture-loaded items. The only statement made concerning the expected
effect of training for Anglo students on non-culture loaded items is that
it should be more or less equal to that for minority subjects. This will be
discussed in more detail once the data have been examined. Table 21
provides a summary of the outcomes expected on the five comparisons which
would be consistent with a culture-loaded hypothesis.

Table 21

Summary of Culture-Loaded Outcomes
for Specific Group Comparisons

In order to test these outcomes, percent passing each item for each
ethnic group was computed for subjects whose M-level was equal to or
greater than the M-demand of the item. The rationale for this is that
comparisons between groups, and the effects of training, would be examined
only on those subjects who possess the minimum processing resources
required for a given item. This takes into account, somewhat, the
discrepancy in proportion of minority subjects obtaining lower M-levels on
the FIT. More important, however, is that it reflects the position that
training would most likely be effective for those subjects who possess the
minimum resources in the first place.
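
A minimal sketch of this restricted percent-passing computation follows. The record layout, the item key, and the five example subjects are hypothetical and serve only to show the filtering step.

```python
def percent_passing(subjects, item, m_demand):
    """Percent of subjects passing `item`, counting only subjects whose
    M-level is equal to or greater than the item's M-demand."""
    eligible = [s for s in subjects if s["m_level"] >= m_demand]
    if not eligible:
        return None  # no subject in this group has the required M-level
    passed = sum(1 for s in eligible if s["responses"][item])
    return 100.0 * passed / len(eligible)


# Hypothetical group of five subjects answering one item.
group = [
    {"m_level": 1, "responses": {"item_x": True}},
    {"m_level": 2, "responses": {"item_x": True}},
    {"m_level": 2, "responses": {"item_x": False}},
    {"m_level": 3, "responses": {"item_x": True}},
    {"m_level": 4, "responses": {"item_x": True}},
]

pct = percent_passing(group, "item_x", m_demand=2)
```

The subject with an M-level of one is left out of both the numerator and the denominator, so a group with many low M-level subjects is not penalized on items beyond those subjects' processing capacity.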

Examination of the expected outcomes is made both for culture-loaded
and non-culture loaded items individually, and for culture-loaded and
non-culture loaded subscale totals formed by grouping the items according
to M-demand. Because of the number of statistical tests required,
examination of individual items is made by observation of the pattern of
differences in performance. Analysis of variance with planned comparisons
is used to examine the hypothesis of interest for the total score obtained
when culture-loaded and non-culture loaded items are grouped according to
M-demand. In the following, the results of individual item data are
presented first. Following this, analyses of culture-loaded and
non-culture loaded subscales are statistically examined.

Item Analysis for Culture-Loading

Individual item results are examined by M-demand for Blacks and
Hispanics separately. Item data are reported according to the previously
defined five group comparisons in Tables 22 and 23 for Blacks and Hispanics
respectively. The comparisons in Tables 22 and 23 represent differences in
percent passing when subjects' M-levels in each group are matched with the
M-demand of the item.

Comparison on Culture-Loaded and Non-Culture Loaded
Raven Items for Black and Anglo Training and Control Groups

Comparison on Culture-Loaded and Non-Culture Loaded
Raven Items for Hispanic and Anglo Training and Control Groups

1 = Control: Hispanic-Anglo
2 = Training: Hispanic-Anglo
3 = Hispanic: Training-Control
4 = Anglo: Training-Control
5 = Interaction: Treatment by Ethnicity

Since statistical analyses are not usually based on item data, a
criterion of 4% was arbitrarily selected as demonstrating a difference in
any particular comparison. Thus, if a treatment effect is expected to be
positive and relatively large, it had to be at least 4% to be considered as
supportive. Similarly, if a comparison is anticipated to show no effect,
it had to be less than 4%. In the following discussion this criterion is
applied when judging whether a particular outcome is consistent with a
culture-loaded expectation. Results are presented for Blacks first, then
Hispanics.
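
The five comparisons and the 4% criterion can be sketched for a single item as follows. The function names and the example percentages are hypothetical. Treating comparison five as the difference between the minority and Anglo training gains is an assumption, consistent with the text's references to interaction gains in favor of one group.

```python
def five_comparisons(min_train, min_control, anglo_train, anglo_control):
    """Differences in percent passing for one item, keyed 1 through 5."""
    minority_gain = min_train - min_control
    anglo_gain = anglo_train - anglo_control
    return {
        1: min_control - anglo_control,  # Control: minority - Anglo
        2: min_train - anglo_train,      # Training: minority - Anglo
        3: minority_gain,                # Minority: training - control
        4: anglo_gain,                   # Anglo: training - control
        5: minority_gain - anglo_gain,   # Interaction (assumed form)
    }


def shows_difference(value, criterion=4):
    """Apply the arbitrary 4% criterion to one comparison."""
    return abs(value) >= criterion


# Hypothetical percent-passing figures for one culture-loaded item.
result = five_comparisons(min_train=80, min_control=50,
                          anglo_train=85, anglo_control=80)
```

For this invented item the control gap of 30 points shrinks to 5 after training, and the interaction of 25 points favors the minority group, which is the pattern the culture-loaded expectation describes.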

Blacks: Observation of Table 22 indicates that all of the items with
an M-demand of one followed the expected pattern. First, larger group
differences are found on culture-loaded items than on non-culture loaded
items. Second, the group difference on culture-loaded items is reduced
through training. This did not occur for non-culture loaded items. Third,
the effect of training was greater on culture-loaded items than on
non-culture loaded items. Finally, the gains for Black students are larger
than for Anglo students.

For items with an M-demand of two, 10 of 14 follow the expected
direction in support of the culture-loaded expectation. Nine of 11 of the
Three-M-demand items show a pattern consistent with culture-loading and 8
of 10 of the Four-M and the Five-M items come out as predicted. Two
apparent culture-loaded items were not detected on the Five-M subscale.

Overall, 46 (or 82%) of the 56 items examined showed results consistent
with a culture-loaded expectation. Of the 10 items that did not conform to
the model, four non-culture loaded items produced results which
suggested they were misidentified. Three culture-loaded items produced
results somewhat consistent with a culture-loaded classification, but yielded
interaction gains in favor of minorities of less than 4%. Thus, these were
not counted as supportive of the hypothesis. The remaining three items
were culture-loaded but produced results opposite of expectation. Two
showed interactions in favor of Anglo students (5% and 8%), and one showed
no difference (2%) instead of a gain for minority students.

Hispanics: Results for Hispanics (Table 22) reveal that One-M items
produce outcomes consistent with the culture-loaded interpretation. Two
non-culture loaded items, however, show a 5% and 8% gain for minority
students. All but two of the Two-M items appear correctly identified. The
two errors were items identified as culture-loaded. One showed a 5% gain
for Anglos and the other, while consistent with the culture-loaded
expectations, showed only a 4% gain for Hispanic students over Anglos. Three
of the 11 Three-M items did not follow the expected pattern. Three
culture-loaded items showed no gain for minorities even though they were
consistent on the other comparisons.

Of the 10 Four-M items, two non-culture loaded items showed gains of
5% and 14% in favor of Hispanics, inconsistent with their identification.
Three non-culture loaded items in the Five-M subscale produced results
consistent with a culture-loaded identification.


Overall, 44, or 79%, of the 56 items produced patterns consistent with
a culture-loaded hypothesis. Seven of the 12 misidentified items were
classified as non-culture loaded when in fact they produced results expected
of culture-loaded items. The gain for these items ranged from 5% to 20%
in favor of Hispanics. The remaining four errors were identified as
culture-loaded. Of these, two produced gains in favor of Anglos and two
showed no gain for either group.

In general, the results support a culture-loaded explanation and thus
support the definition based on information processing capacity. In most
cases items identified as culture-loaded and non-culture loaded produced
results in the desired directions. Of the items that were misidentified,
many either showed no gain for minority students or were thought to be
non-culture loaded but produced a large gain in favor of minority
students. These outcomes are consistent with the previous statement
that the procedure for identifying culture-loaded items is probably
conservative, i.e., is likely to miss some culture-loaded items.

Summary

The above discussion was based on subjective judgements about the
expected size of the differences in comparisons. A conservative criterion was
applied, although some may certainly wish to argue this point. The
subjective rule was to judge items as agreeing with the culture-loaded hypothesis
if: a) results on all 5 comparisons were in the predicted direction, b)
the gains for minority students on culture-loaded items were greater than
4% over Anglo gains, and c) the gains for minority students on non-culture
loaded items were 4% or less. If any of these three criteria were not
satisfied, the results were not considered to support a culture-loaded
interpretation. The main point, however, is that the focus was mainly on the
constancy of the pattern produced. To this extent there was certainly a
constancy observed. The overall results are summarized in Table 24.
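
The three-part rule can be written out as a small decision function. This is only a sketch under stated assumptions: the field names are hypothetical, the cut-off in (b) is taken to be the same 4% criterion used elsewhere in the section, condition (a) is supplied as a judgement already made by the analyst, and the gain in (c) is read as the minority gain over the Anglo gain, which the text does not state explicitly.

```python
from dataclasses import dataclass

THRESHOLD = 4.0  # percentage points; the arbitrary criterion named in the text


@dataclass
class ItemResult:
    """Percent-passing results for one item (hypothetical field names)."""
    predicted_culture_loaded: bool
    all_five_in_predicted_direction: bool  # condition (a), judged by the analyst
    minority_gain: float  # training minus control, minority group
    anglo_gain: float     # training minus control, Anglo group


def supports_hypothesis(item: ItemResult) -> bool:
    """Apply conditions (a) to (c) of the subjective rule to one item."""
    if not item.all_five_in_predicted_direction:
        return False  # condition (a) failed
    # Assumption: gains are compared as minority gain minus Anglo gain.
    interaction = item.minority_gain - item.anglo_gain
    if item.predicted_culture_loaded:
        return interaction > THRESHOLD   # condition (b): more than 4%
    return interaction <= THRESHOLD      # condition (c): 4% or less
```

Under this reading, a culture-loaded item whose minority gain exceeds the Anglo gain by exactly 4% is not counted as supportive, which matches the way the Two-M item with "only a 4% gain" is treated in the Hispanic results.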

In general the procedure for identifying culture-loaded items appeared
to be consistent with expected outcomes. Items identified as culture-
loaded: 1) produced greater group differences than non-culture loaded



Table 24

Summary Results of Agreement in Identification of
Culture Loaded and Non-Culture Loaded Raven Test Items

BLACKS
                                  Met Expectation
Predicted              Culture Loaded   Non-Culture Loaded   Total
                           Items              Items
Culture Loaded               20                 6              26
Non-Culture Loaded            5                25              30
Total                        25                31              56

HISPANICS
                                  Met Expectation
Predicted              Culture Loaded   Non-Culture Loaded   Total
                           Items              Items
Culture Loaded          [illegible]       [illegible]          26
Non-Culture Loaded            7                25              30
Total                        28                28              56

Underline indicates agreement between items predicted as culture-loaded
or non-culture loaded and those that met the expected criteria.
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items; 2) had the difference reduced or eliminated through training; 3) 

showed greater training effects; and 4) produced greater gains for minority 
students due to training than for Anglo students. Non-culture loaded items 
generally produced the opposite results. 

The following section presents the application of statistical criteria
to the results of culture-loaded and non-culture loaded subscales.

Analysis of Culture-Loading on Items Grouped According to M-demand

Statistical tests of the culture loading hypothesis were made by
grouping items of the same M-demand into two subscales according to whether
the items are culture-loaded or not culture-loaded. An analysis of the
five comparisons described above was then performed. The cell-means model
of analysis was used to test the statistical significance of the five
contrasts. The "mean squares within" term for each subscale was obtained
through a "one-way" ANCOVA between the particular groups included in the
analysis, i.e., Black-Anglo and Hispanic-Anglo. Contrasts were computed
separately because different items were identified as culture-loaded for
Black and Hispanic groups.
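
As an illustration of how the five comparisons can be tested as planned contrasts on cell means, the following sketch computes the contrast estimate, its standard error, and the t-value. It is not the report's own procedure: the cell ordering is an assumption, and the "mean squares within" term is taken here as the pooled within-cell mean square of the four cells rather than the value from the one-way analysis the authors describe.

```python
import math

# Assumed cell order for this sketch:
# control-minority, control-Anglo, training-minority, training-Anglo
CONTRASTS = {
    "1 control: minority - Anglo": (1, -1, 0, 0),
    "2 training: minority - Anglo": (0, 0, 1, -1),
    "3 minority: training - control": (-1, 0, 1, 0),
    "4 Anglo: training - control": (0, -1, 0, 1),
    "5 interaction: treatment by ethnicity": (-1, 1, 1, -1),
}


def mean(xs):
    return sum(xs) / len(xs)


def mean_square_within(cells):
    """Pooled within-cell mean square and its degrees of freedom."""
    ss = sum(sum((x - mean(c)) ** 2 for x in c) for c in cells)
    df = sum(len(c) for c in cells) - len(cells)
    return ss / df, df


def contrast_t(cells, weights):
    """Return (estimate, standard error, t, df) for one planned contrast."""
    msw, df = mean_square_within(cells)
    estimate = sum(w * mean(c) for w, c in zip(weights, cells))
    se = math.sqrt(msw * sum(w * w / len(c) for w, c in zip(weights, cells)))
    return estimate, se, estimate / se, df
```

Each t-value would then be compared with a one-tailed critical value in the expected direction. The interaction weights express the minority training-minus-control gain minus the Anglo training-minus-control gain.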

Between group contrasts were tested with alpha controlled at .025
(one-tail), yielding a total alpha of .10. The interaction contrast was
tested at alpha equal to .05 (one tail). This is consistent with a .15
alpha allowed in the full factorial two-way analysis of variance with
interaction. In the following, the results are reported separately, first
for Blacks then Hispanics.
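
The alpha figures above follow from simple addition over the five contrasts. The short sketch below reproduces that arithmetic; treating the sum as an upper bound on the familywise Type I error rate is the usual additive (Bonferroni-type) reading and is an interpretation added here, not a statement from the report.

```python
# Per-contrast one-tailed alpha levels as stated in the text.
between_group_alphas = [0.025] * 4   # contrasts 1 to 4
interaction_alpha = 0.05             # contrast 5

between_total = sum(between_group_alphas)
family_total = between_total + interaction_alpha

# Rounded to avoid floating-point noise in the display.
print(round(between_total, 3), round(family_total, 3))
```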

Blacks. Table 25 gives the percent of Blacks and Anglos passing
culture-loaded and non-culture loaded subscales defined according to M-demand.
Statistical analysis of the data is provided in Table 26. The data in
Table 26 show the difference between percent passing for each comparison,
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Table 26 J 85^ 

Group Comparisons on Culture Loaded and Non-Culture Loaded 
Items for Black and Anglo Training and Control Group Students 
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standard error term, t-value, and the percent of variance explained by the 
contrast. The data are organized according to item M-demand and represent
performance of subjects with an M-level equal to or greater than the
M-demand of the items in the subscale. Since no training subjects in the
Black or Hispanic groups, and very few in the control group, had M-levels of
six, only subjects with an M-level of 1 to 5 are included in the analysis
and subscale six is not examined.
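
The subject selection described here can be sketched as a filter applied before percent passing is computed. The data layout (pairs of M-level and a pass/fail flag) is hypothetical, as is the idea that each subject has a single pass/fail outcome per subscale; the report does not say how passing a subscale was scored.

```python
def percent_passing(subjects, m_demand, max_level=5):
    """Percent passing among subjects whose M-level is at least the M-demand.

    subjects: list of (m_level, passed) pairs, a hypothetical layout.
    Subjects above max_level are excluded, mirroring the exclusion of
    M-level six from the analysis.
    """
    eligible = [passed for m_level, passed in subjects
                if m_demand <= m_level <= max_level]
    if not eligible:
        return None  # no subject qualifies for this subscale
    return 100.0 * sum(eligible) / len(eligible)
```

Because fewer subjects qualify as the M-demand rises, the number of eligible subjects shrinks for the higher subscales.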

Statistical significance of the t-values is based on planned
comparisons (Kirk, 1968). All comparisons are one-tail according to the expected
direction. The expected direction appears in the footnote at the bottom of
Table 26 for those comparisons that are significant. Eta-squared (η²)
represents the amount of explained variance and has been multiplied by 100
to convert it to a percentage; in most cases only explained variances of at
least 2% are reported. It should also be noted that η² is not
independent across comparisons since the contrasts are not orthogonal.
Nevertheless, it is presented because of the decrease in sample size as fewer
students are included in the analysis at higher M-levels.
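
One common way to express the variance explained by a single contrast as a percentage is sketched below. The report does not give its formula, so the choice of denominator (the total sum of squares around the grand mean of the four cells) is an assumption made for illustration.

```python
def eta_squared_percent(cells, weights):
    """Percent of total variance explained by one contrast on cell means."""
    means = [sum(c) / len(c) for c in cells]
    psi = sum(w * m for w, m in zip(weights, means))
    # Sum of squares for a single-degree-of-freedom contrast.
    ss_contrast = psi ** 2 / sum(w * w / len(c) for w, c in zip(weights, cells))
    all_scores = [x for c in cells for x in c]
    grand_mean = sum(all_scores) / len(all_scores)
    ss_total = sum((x - grand_mean) ** 2 for x in all_scores)
    return 100.0 * ss_contrast / ss_total
```

Because the five contrasts share cells, the percentages computed this way overlap and should not be added together, which is the non-independence the text points out.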

Results of items with an M-demand of one show a training effect for
Black students on both culture-loaded and non-culture loaded items. None
of the other contrasts is significant. The explained variance on
culture-loaded items is, however, larger than on non-culture loaded items.
The interaction term, which indicates a 20% greater gain for Blacks over
Anglos, was also not significant and accounted for only 2% of the explained
variance.

Two-M items showed the same pattern of significance for culture-loaded
and non-culture loaded items. For culture-loaded items neither the
training effect for the Blacks nor the interaction was significant. The
explained variance, however, is noticeably larger between control Blacks and
Anglos on culture-loaded items. In comparison to the percent explained in
the training groups and for non-culture loaded items, this suggests that
group differences were greater on culture-loaded items and that the
difference was reduced through training.

Results of items with an M-demand of four are exactly consistent with
all culture-loading expectations. That is, greater group differences occur
on culture-loaded items, but training eliminated the difference. The same
pattern, however, did not occur on non-culture loaded items, and Black
students showed significantly greater gains due to training than Anglo
students. Of note is that training had the effect of eliminating the original
18.5% variance accounted for by ethnic group differences on these items.

Results for Five-M items are also consistent with the culture-loaded
expectation. While between group differences were not significant on
culture-loaded items (probably because of small N's), the significant
training effect for Blacks (23% variance accounted for) and significant
interaction (10% explained variance) indicate that a culture-loaded
interpretation is supported. In contrast, non-culture loaded items showed a
significant training effect for Anglo students and also accounted for 23%
explained variance.

When subscales are combined to form a total culture-loaded and
non-culture loaded score, results for all subjects (M-levels 1 through 5) are
consistent with a culture-loaded hypothesis. Significant ethnic group
differences occur on culture-loaded items but not on non-culture loaded items;
the difference is reduced after training. Training on culture-loaded items
is effective for Blacks only, and a significant gain occurs for Blacks. On
non-culture loaded items, training produced significant gains for both
groups and resulted in significant group differences in performance on
non-culture loaded items. Nevertheless, group differences after training
accounted for 5% of the variance on culture-loaded items prior to
training.

In terms of the culture-loaded hypothesis, results are consistent with
expectations. That is, 1) significant group differences occur on
culture-loaded items, 2) this difference is reduced or virtually eliminated
by training, 3) training was differentially effective for Blacks on
culture-loaded items, and 4) the gains for Black students on culture-loaded
items were generally greater than those of Anglo students.

Hispanics: Percent passing items identified as culture-loaded and
non-culture loaded for Hispanics is shown in Table 27. The percent passing
each subscale is based on students with an M-level at least equal to the
M-demand of the item.

Statistical analysis of the data for Hispanics is shown in Table 28.
Total performance is given at the bottom of Table 28 and is computed on
students with M-levels between 1 and 5 inclusive.

The pattern of performance for Hispanics is consistent with a
culture-loaded hypothesis as follows: 1) significant group differences on
culture-loaded but not non-culture loaded items, 2) elimination of group
differences on culture-loaded items due to training, 3) significant
training effects for Hispanics but not for Anglos on culture-loaded items, and
4) significantly greater gains for Hispanics over Anglos on culture-loaded
items.

In general, the results follow the expected pattern. Only items with
an M-demand of one, however, follow the pattern completely. At least part
of the hypothesis was supported on the remaining subscales. In some cases



Table 27 

Percent Passing Culture-Loaded and Non-Culture Loaded Items 
for Hispanic and Anglo Training and Control Groups 
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Table 28 QQ 
Group Comparisons on Culture Loaded and Non-Culture Loaded
Raven Subscales for Hispanic and Anglo Training and Control Groups
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non-significance was probably related to small sample size; however, the 
explained variance followed the expected pattern. For example, on items 
with an M-demand of five, significance occurred in only one instance 
(training effect for Hispanics). However, the 4.6% variance explained by 
ethnic group difference for control group students and the lack of variance 
explained by ethnic group difference with training group students follows 
the expected outcome.

Culture-loaded items with an M-demand of two produced ethnic group
differences in the control groups as did non-culture loaded items. The
amount of explained variance, however, was 16% on culture-loaded items in
comparison to 4% on non-culture loaded items. Additionally, training
reduced the ethnic group difference on culture-loaded items but not on
non-culture loaded items. The effect of training was 14% higher on
culture-loaded items than on non-culture loaded items, but was not significant.
Finally, and perhaps more important, the gains for Hispanics on
culture-loaded items were statistically greater than the gains for Anglos.
Even though the observed treatment effect was not significant, it is clear
that the overall effect on culture-loaded items was consistent with
expectation. The interaction contrast for non-culture loaded items was not
significant.

On items with an M-demand of three, significant group differences
occurred on culture-loaded items but not on non-culture loaded items. The
difference did not hold up after training, and training was significant for
Hispanic but not for Anglo students. Only the non-significant interaction
failed to support the culture-loaded hypothesis for this subscale.



Four-M items followed the same pattern as items with an M-demand of 
three. Again, only the interaction was not as expected; all other con- 
trasts were consistent with expectation. 

Five-M items showed a significant training effect for Hispanics on both
types of items. On culture-loaded items the variance explained was 17% in
comparison to 12% for non-culture loaded items.

Finally, the results for all students across all items were consistent
with expectation except for the interaction. Overall, greater group
differences occurred on culture-loaded items and explained 14% of the variance
in contrast with 3% explained on non-culture loaded items. Training had
the effect of removing ethnic group differences on culture-loaded items but
not on non-culture loaded items.

Summary of Results on Culture-Loading

Identification of items as culture-loaded was based on outcomes or
results considered to be consistent with the culture-loaded hypothesis.
These are: 1) significant ethnic group differences on culture-loaded
items; 2) elimination or reduction of ethnic group differences on
culture-loaded items after training; 3) significant training effects for minority
students but not Anglo students on culture-loaded items; and 4) greater
training effects on culture-loaded items for minorities than for Anglos.

In general these outcomes were supported on all the Raven subscales,
and when subscales were combined to form culture-loaded and non-culture
loaded totals. The outcome that occurred with the least frequency was a
significant interaction effect, i.e., greater gains between minority
training and control students than for Anglo training and control students. In
total, the results were consistent with an expectation derived from a
culture-loaded explanation.



The results are also of significance because the outcomes were
predicted for certain items and because more than one outcome
(i.e., pattern) was correctly predicted. In cases where all expectations
did not occur there was generally just one outcome that was not as
predicted. Admittedly, the procedure for selecting the items will tend to
produce greater group differences on the so-called culture-loaded items,
since, after all, they were the items which showed a 25% or more difference
at a given M-level in the first place. On the other hand, not all such
items necessarily showed the greatest group differences, nor did items not
identified by this criterion fail to show group differences. The more
significant fact is that the pattern in the five comparisons was generally
supported as predicted; the greater gains in particular (i.e., effects for
particular groups but not others) suggest the procedure is sound. It is not
group differences alone, but the consistent overall pattern that is significant.

The results indicate that in most instances items were correctly
identified in which training was likely to result in improved performance
for minority students. In some cases Anglo performance was affected
positively too. Overall, the effect was to bring the performance of minority
and majority groups closer together.

Finally, several comments are in order regarding the results presented.
First, many statistical tests were performed and one would expect a
certain percent to be significant by chance factors alone. It is pointed
out, however, that the alpha level was controlled for each family of tests.
This allows for some control of the Type I error rate. Perhaps more
important is the pattern of significance. For example, had all statistical
tests been performed at .05, then roughly 5% would be significant simply by
chance. The important thing to observe is that significance occurred on
precisely those outcomes predicted to be significant, and in the predicted
direction. Too, outcomes predicted to not be significant generally were
not. Thus, significance did not occur on a random basis. It is the
consideration of both the control of the alpha level and the pattern of
significant outcomes that supports the results. At the same time, the question
is not really one of statistical significance in the first place. In the
final analysis, it is the overall consistency with which the outcomes hold
together.

Second, the reader is reminded of at least two sources of error that
militate against the consistency of the observed outcomes. One is that the
study relied on the identification of an item's M-demand by external sources. To
the extent that there was error in this process, it surely affected the
outcomes. Another is the identification of subjects' M-level. The FIT
itself could have introduced a source of error. The finding of a certain
percentage of minority and Anglo students with M-levels less than or equal
to zero, and the realization that training was effective for these
students, suggest that some measure of error occurred in this respect.

As a final note, it is argued that children did not acquire the
particular skills necessary to take the Raven test, nor were they taught. It is
the authors' position, though not tested, that children have the required
underlying test taking skills and that we simply provided them with the
chance to learn how to apply them in this particular situation. And, while
teachers noted changes in children's approach to many classroom tasks, we
have no evidence as to the effects of the training outside of the Raven
test.

In this study, only eight hours of training were provided. During the
training, we did not show the children how to solve a particular task. We
simply pointed out the errors they were making, and encouraged them to
develop alternative strategies, to the extent that they became aware that the
one they were applying was either not sufficient or wrong. Often children
assumed their answer was correct and did not attempt to check their answers
or look for alternative solutions. Had the children not had the skills in
the first place, it is unlikely that the training could have provided them
in such a short period of time. It is also unlikely that training could
produce marked jumps in developmental abilities.

Finally, a few words are necessary concerning Hispanic students in the
study. Many of the students were in bilingual education programs, spoke
limited or no English and thus had to receive training in Spanish.
However, this proved to be difficult because of the need for "on the spot
translations" as well as the use of a particular vocabulary. The training
required words not commonly used by children or the trainers since many of
them had not been "educated" in a Spanish school system. While every
attempt was made to focus on communication, there was undoubtedly error or
ambiguity introduced due to language. An attempt was originally made to
gather language proficiency data in English and Spanish, but was
discontinued because of cost and time constraints. Many of the children,
however, spoke only Spanish and several more were classified as limited
English speakers. The exact effect of this source of uncontrolled error is
not known. It may have contributed to performance differences between
Hispanics, Blacks, and Anglos, and it undoubtedly contributed to poorer
success in the effect of training.
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CONCLUSION 

Overall, it appears that the procedure, while admittedly not clear
cut, did produce results which suggest that there is a form of cultural
test bias in that culture-loaded items could be identified which produced
lower scores for particular groups of children and which were
differentially affected by training as predicted. The fact that specific items
could be identified in advance, and that training was differentially
effective, sheds new light on test performance in general. The results should
provide a better understanding of, and insights into, what tests are
measuring irrespective of ethnic background.

Important too is the fact that a particular cognitive developmental
theory was shown to be useful and related to psychometric test performance.
It is significant that evidence was provided for a specific source of
differences in test performance. That is, test performance error,
whether one calls it lack of validity or bias, is in part due simply to
test taking skills. In all, the training was really an exercise in
providing test taking skills to a particular group of children who, prior to
training, did not know how to apply them to the particular paper and pencil
tests used.

One cannot discuss the question of culture-loading and test bias
without the issue of group differences arising. The results in this study
indicate that while group differences on the Raven appear to be related to
group differences in developmental level, this is not entirely the case.
That is, even though minority students tend to score lower on developmental
measures and to show an overall proportion of students obtaining a lower
M-level, two results suggest that it is test taking skills which affect
paper-pencil tests in general. For example, when performance of subjects
within M-levels is examined, ethnic group differences in performance are
observed in the control but not the training groups. Training group
students showed virtually the same performance within M-levels, with minority
students often scoring higher. In addition, there were no significant
ethnic group differences between training group subjects in spite of the
fact that more minority subjects had lower M-levels. In short, it is
unlikely that training of such short duration could have an effect on
cognitive developmental level, or have the differential effect in favor of
minority students, unless there was a learned skill involved. It is also more
likely that the M-level performance of minority students was also
depressed. The higher performance of minority students within M-levels in the
training group is consistent with this interpretation. In all, it is
interpreted that test taking skills are a major source of variation, that
these skills are learned, and that they can be overcome through exposure to
the specific requirements of the test.

These results suggest that the question of test bias is not
necessarily in the test itself. It is in the overall testing procedure, most
important of which is the assumption that all children approach and solve a
task according to the way in which the test publisher wishes. The
indiscriminate use of tests without awareness or consideration of this factor
will result in errors in validity as well as bias. The conscientious test
publisher, as well as the test user, should be well aware of these problems.
This study provides evidence that at least this one source of error has not
been considered, that it is not commonly taken into account by test developers
or users, and that no amount of statistical or psychometric validation is
likely to account for this error. The use of cognitive or some other
theory of mental functioning is sorely needed in development and validation
of tests. The researchers hope that this study will provide incentive as
well as direction for future fair uses of tests.
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